Abstract

An (algebraic) automorphism of the 2-torus is defined in the standard way by a
matrix with integer coefficients and determinant 1 or -1. An automorphism is
hyperbolic if the eigenvalues of this matrix are real, with absolute value > 1
for one eigenvalue (and < 1 for the other). Iterations of such an automorphism
$A$ constitute a dynamical system (DS) with discrete time: phase points do not
move continuously, as they do for a DS described by differential equations, but
jump from one place to another; the moving phase point which originally (at the
zero moment of time) occupied the position $x$ moves to $A^n x$ during the time
$n$. Hyperbolicity implies that although formally this DS is deterministic, the
behaviour of its trajectories actually resembles, in a sense, the behaviour of
some random (stochastic) process. Markov partitions are the best method to
establish this analogy, which is even a kind of isomorphism.

This text is based on the talk the first author gave in Germany, but the text
is more detailed. It consists of four parts.^ In the first part we explain, on a
simpler example (the expanding map of the circle), how a deterministic DS can be
isomorphic to a random process. In the second part we dwell on the
classification of hyperbolic toric automorphisms. In the third part we define
the notion of a Markov partition and explain how such partitions can be used and
how one can construct the simplest Markov partitions (perhaps some details of
the construction are somewhat new). Finally, in the fourth part we describe a
kind of classification of these simplest Markov partitions (this is new).

Parts 2, 3 and 4 are based on the work of A.V. Klimenko and G. Kolutsky,
who are Ph.D. students of D.V. Anosov. Besides him, at the beginning of their
work their unofficial scientific advisor was A.Yu. Zhirov. Part 2 is an
exposition of results which seem to be known in number theory; the version
presented here was elaborated by G. Kolutsky. Parts 3 and 4 are mainly due to
Klimenko; the idea of using results and notions from Part 2 for the goals of
Part 4 was a result of his discussion of the matter with Kolutsky; also, they
examined the first several examples together.

1 Introduction

Two big parts of the theory of dynamical systems can be characterized as
dealing with motions of "regular" and of "stochastic, quasi-random, chaotic"
character. The simplest examples of regular motions (and those which are,
informally, "the most regular") are periodic or quasiperiodic motions. (Thus the
study of regular motions is as old as science itself: some regularity of the
planets' motion was known and exploited by the Babylonians, and in the more
advanced system of Ptolemy these motions were essentially described by
trigonometric polynomials.) Examples of "chaotic" motions are much more recent.
As far as we know, the first example of this kind was pointed out by
J. Hadamard about 1900. A couple of decades earlier H. Poincaré discovered the
so-called "homoclinic points", which now serve as practically the main "source"
of "chaoticity"; however, Poincaré himself said only that the "phase portrait"
(i.e. the qualitative picture of the trajectories' behaviour in the phase space)
near such points is extremely complicated. A couple of decades after Hadamard,
E. Borel encountered a much simpler example of "chaoticity", where it is easy to
understand the driving mechanism of this phenomenon. We shall begin with a
description of his example. About 100 years later it remains the simplest
manifestation of the fact that a dynamical system (which, by definition, is
deterministic) can somehow resemble a stochastic process (in fact, even be, in a
reasonable sense, isomorphic to such a process).

In this example the phase space is the circle $\mathbb{S}^1 = \mathbb{R}/\mathbb{Z}$.
We shall often say that $\mathbb{R}$ projects onto $\mathbb{S}^1$ by the
projection $p$. We can consider the usual coordinate $x$ in $\mathbb{R}$ as a
"cyclic coordinate" on $\mathbb{S}^1$. In its terms we define the map

$$f : \mathbb{S}^1 \to \mathbb{S}^1, \qquad f(x) = 2x. \tag{1}$$

More formally, we begin with the map

$$g : \mathbb{R} \to \mathbb{R}, \qquad x \mapsto 2x$$

and project it onto $\mathbb{S}^1$ (so $p(x) \mapsto p(2x)$; we use the fact
that the points $2x$ and $2(x + n)$, where $n$ is an integer, project to the
same point of $\mathbb{S}^1$; more formally, we use that
$g(\mathbb{Z}) \subset \mathbb{Z}$, so that $g$ maps the class $x + \mathbb{Z}$
to the class $2x + \mathbb{Z}$). Pictorially, considering $\mathbb{S}^1$ as made
from rubber, we stretch it to double its length and then cover the original
$\mathbb{S}^1$ by this expanded circle (so each point of the initial circle is
covered by two points of the expanded one).^

Our dynamical system consists of the iterations $\{f^n\}$ of $f$, so that any of
its trajectories is a sequence $\{f^n(x),\ n \in \mathbb{Z}_+\}$ (here, in
Bourbaki's style, + is used to deceive a spy; actually
$\mathbb{Z}_+ = \{0, 1, 2, \ldots\}$). Thus it is a system with discrete time
($n$ plays the role of time: during the time $n$ the moving phase point "jumps"
from the original position $x$ into the position $f^n(x)$).

Remark: One can inquire whether it is possible to construct a system with
continuous time exhibiting "chaotic" properties analogous to those we are going
to discuss for our $\{f^n\}$, and whether there exist dynamical systems with
chaotic behaviour of their trajectories among the systems of the most classical
character, namely those described by phase velocity vector fields $v$ on a
smooth phase manifold $M$ (the moving phase point moves according to the
differential equation $\dot{x} = v(x)$, which in terms of local coordinates
looks like a "habitual" system of autonomous differential equations). The answer
is positive. Essentially, the first examples of this kind were found in the
process of improving Hadamard's results. But for all such systems the phase
space is unavoidably of dimension not less than 3, and they are much more
complicated than Borel's example.

Another question preceding the discussion of any concrete properties of Borel's
example is the following. In this example the map $f$ is irreversible; so we can
speak about the future motion of the moving phase point (it occupies the
position $x$, then $f(x)$, then $f^2(x)$, and so on), but we cannot speak about
its position for negative time $n$. Is it possible to construct "reversible
chaotic" examples? Basically, the positive answer to the previous question
indicates that this is possible (in "classical" dynamical systems the time is
reversible), so that the question remains only for phase spaces of dimension
less than 3. This can be achieved if we pass from continuous time to discrete
time. Actually the main content of this paper will be related to the simplest
example of this kind. Reversibility is gained at the price of increasing the
dimension of the phase space to 2 instead of 1; namely, we shall deal with a
smooth automorphism of the 2-torus. But we begin with Borel's example, as it is
simpler.

^This $f$ is an example of a so-called "expanding map" of $\mathbb{S}^1$. We
shall not need to define this class of maps, as we shall deal with $f$ only. But
on the "conversational level" it is clear that $f$ deserves to be called
"expanding".

From now on till the end of this part $f$ means Borel's $f$ defined by (1). If
we knew $x$ precisely, we could compute its trajectory $\{f^n(x)\}$. But assume
that we know the phase point we have to deal with only approximately, although
with a good approximation. So instead of the "true" trajectory $\{f^n(x)\}$ (or
$\{2^n x\}$ in terms of the cyclic coordinate) we compute the trajectory
$\{f^n(y)\} = \{2^n y\}$ with some $y$ at a small distance $\delta$ from $x$.
The distance between $f^n(x)$ and $f^n(y)$ is $2^n \delta$ (as long as this
quantity does not exceed 1/2, half the length of the circle). For the first
several numbers $n$ the error is small, but it rapidly increases with $n$.
Without entering into refinements of the terminology, this can be called
instability, and even a strong one: roughly speaking, this kind of instability
means that two phase points which originally were close to each other can
rapidly diverge under the action of the iterations. (More technically, this type
of instability is called exponential, uniform and complete; we shall not dwell
on this.) If $\delta$ is of the order $10^{-8}$ (the size of an atom in
centimeters), then for $n = 30$ the error will be of order 10, i.e. of the
macroscopic order: of the same size as the laboratory equipment or (returning to
our example) as our circle $\mathbb{S}^1$ (formally, even more than it). Then
all that we can say is that the moving phase point $f^n(x)$ is somewhere on the
circle, a trivial conclusion which can be made without any measurements and
calculations.
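The growth of the error can be watched numerically. The following is only a minimal sketch, assuming exact rational arithmetic; the helper names `doubling`, `circle_distance` and `separations`, as well as the sample values of $x$ and $\delta$, are illustrative choices and do not come from the text.

```python
from fractions import Fraction

def doubling(x):
    """Borel's map f(x) = 2x on R/Z, written in the cyclic coordinate x in [0, 1)."""
    return (2 * x) % 1

def circle_distance(x, y):
    """Length of the shorter arc between the points with cyclic coordinates x, y."""
    d = abs(x - y) % 1
    return min(d, 1 - d)

def separations(x, y, steps):
    """Distances between f^n(x) and f^n(y) for n = 0, ..., steps."""
    result = []
    for _ in range(steps + 1):
        result.append(circle_distance(x, y))
        x, y = doubling(x), doubling(y)
    return result

# Illustrative initial data (an assumption): x = 1/3, error delta = 10^-8.
x = Fraction(1, 3)
delta = Fraction(1, 10**8)
gaps = separations(x, x + delta, 30)

for n in (0, 10, 20, 25, 30):
    print(n, float(gaps[n]))
```

As long as $2^n \delta$ stays below 1/2 the printed distance doubles at every step; after that it merely wanders over the circle, which is the loss of information described above.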

Besides this "growth of uncertainty", which comes to attention when we compare
the behaviour of two different trajectories $f^n(x)$ and $f^n(y)$ (with
$x \approx y$), the behaviour of most individual trajectories $f^n(x)$ also
demonstrates features which make it reasonable to characterize it as chaotic.
We shall see this later.

About 1910 Poincaré wrote that in such a situation, instead of computing the
"individual" trajectory more or less exactly (which is practically impossible),
one can try to make some statistical statements concerning some features of the
behaviour of a "majority" of trajectories or of the "typical" trajectories.
Instability, in his opinion, was the source (which can be a hidden source) of
probability.

We suspect that besides Poincaré some physicists also shared this point of view
at that time (the very end of the XIX and the beginning of the XX century). But,
in any case, he expressed it quite distinctly and illustrated it by a
mathematical example. We shall not dwell on it, because the later example of
Borel provides a better illustration which at the same time is closer to the
goal of this paper. (In Poincaré's example individual trajectories were not
chaotic and the distance between $f^n(x)$ and $f^n(y)$ was growing more slowly
than in Borel's case.)

Now we know that besides instability there exists at least one more source of
random behaviour, namely quantum effects. But this does not abolish those
effects which are due to instability and so emerge even in the classical
situation.

Actually Borel spoke not about the circle map $f$, but about the interval map

$$[0, 1) \to [0, 1), \qquad x \mapsto \{2x\}$$

($\{\cdot\}$ means the fractional part). This map has the disadvantage of being
discontinuous at the point $x = 1/2$. For a reason to be explained below, this
discontinuity did not trouble Borel. However, we see that we can easily get rid
of it, just by replacing $[0, 1)$ by $\mathbb{S}^1$.

In Borel's original version it is especially clear that the map $f$ is quite
lucidly described in terms of the expansion of $x$ into an infinite binary
fraction. If, in these terms,

$$x = 0,a_1 a_2 a_3 \ldots \quad \text{with all } a_i \text{ being 0 or 1},$$

which means that

$$x = \frac{a_1}{2} + \frac{a_2}{2^2} + \frac{a_3}{2^3} + \ldots,$$

then $f(x) = 0,a_2 a_3 a_4 \ldots$. The comma separating the integer part of the
binary fraction from its fractional part is moved one step to the right, and
everything that ends up to the left of the shifted comma is replaced by zero.
One can also say that the comma's position is fixed, but the infinite sequence
$a_1 a_2 a_3 \ldots$ shifts one step to the left and the coefficient $a_1$
(which now stands to the left of the comma) is discarded (i.e. replaced by 0).
The binary expansion of $x$ is not unique for binary-rational $x$ (i.e. for
those of the form $x = k/2^n$ with integers $k$ and $n$). But this is harmless,
because if two binary expansions represent the same $x$, then the shifted binary
expansions represent the same $f(x)$.
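The description of $f$ as a shift of binary digits is easy to check by direct computation. The snippet below is a small sketch under the assumption that $x$ is given as an exact rational number; the function names and the sample point $5/7$ are chosen here for illustration only.

```python
from fractions import Fraction

def binary_digits(x, count):
    """First `count` binary digits a_1, a_2, ... of a number x in [0, 1).

    For binary-rational x this returns the expansion ending in zeros.
    """
    digits = []
    for _ in range(count):
        x *= 2
        digit = int(x >= 1)   # the digit that crossed the comma
        digits.append(digit)
        x -= digit            # discard it, keep the fractional part
    return digits

def doubling(x):
    """The map f(x) = 2x in the cyclic coordinate x in [0, 1)."""
    return (2 * x) % 1

x = Fraction(5, 7)                      # illustrative sample point
digits_x = binary_digits(x, 12)
digits_fx = binary_digits(doubling(x), 11)

print(digits_x)    # digits a_1 a_2 a_3 ... of x
print(digits_fx)   # digits of f(x): the same sequence without a_1
```

Comparing the two printed lists shows that the digits of $f(x)$ are the digits of $x$ with the first one dropped.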

In terms of the circle $\mathbb{S}^1$ one can interpret the binary expansions as
follows. The points $p\left(\frac{i}{2^n}\right)$ ($i = 0, \ldots, 2^n - 1$)
divide $\mathbb{S}^1$ into $2^n$ arcs; the $(i+1)$-th arc consists of the points
$p(x)$ obtained when $x$ increases from $\frac{i}{2^n}$ to $\frac{i+1}{2^n}$,
i.e., this arc is $p\left(\left[\frac{i}{2^n}, \frac{i+1}{2^n}\right]\right)$.
Let us denote these arcs as follows. If $b_k \ldots b_1 b_0$ is the binary
representation of $i$, we define $b_{k+1} = b_{k+2} = \ldots = b_{n-1} = 0$ and
then associate with each $i = 0, 1, \ldots, 2^n - 1$ the sequence
$b_{n-1}, \ldots, b_0$. (E.g., the binary representation of $i = 3$ is 11, and
if $n = 4$, we associate with 3 the finite sequence 0011.) Having in mind this
correspondence between numbers $i$ and sequences $b_{n-1} \ldots b_0$, denote

$$p\left(\left[\frac{i}{2^n}, \frac{i+1}{2^n}\right]\right) = C_{b_{n-1} \ldots b_0}.$$

Then^

$$p(x) \in C_{b_{n-1} \ldots b_0} \quad \text{if and only if} \quad x = 0,b_{n-1} \ldots b_0 * \ldots *$$

A point with a binary-rational cyclic coordinate has two binary expansions, say,

$$0,a_1 \ldots a_k 0 1 \ldots 1 \ldots \quad \text{and} \quad 0,a_1 \ldots a_k 1 0 \ldots 0 \ldots \tag{2}$$

If $k \ge n$, the first $n$ coefficients of these expansions are the same, and
so for both expansions our recipe says that $p(x) \in C_{a_1 \ldots a_n}$. If
$k < n$, the point $p(x)$ is the common endpoint of two adjacent arcs
$C_{c_1 \ldots c_n}$, and their labels $c_1 \ldots c_n$ are the first $n$ digits
of one or the other binary expansion (2).

This geometric characterization of the binary expansion of $x$ is, so to say, a
"static" one. But it is easy to pass to a "dynamical" characterization of this
expansion:

$x = 0,a_1 * \ldots$ if and only if $p(x) \in C_{a_1}$;

$x = 0,a_1 a_2 * \ldots$ if and only if $p(x) \in C_{a_1}$, $f(p(x)) \in C_{a_2}$
(recall that $f(p(x)) = 0,a_2 * \ldots * \ldots$);

$x = 0,a_1 \ldots a_n * \ldots$ if and only if $p(x) \in C_{a_1}, \ldots, f^{n-1}(p(x)) \in C_{a_n}$.

^Here and below * denotes an arbitrary digit.

Of course it is only the sequence $\{a_n\}$ that is important, not the zero and
comma standing before it. Slightly modifying what was said earlier (and
deviating from literally following Borel), we can adopt the following
agreements. Instead of numbers $x \in [0, 1)$ we shall begin with (singly)
infinite sequences $(a_0, \ldots, a_n, \ldots)$ of numbers (or symbols)
$a_i \in \{0, 1\}$ (now we start numbering them from 0; the advantage of this is
that now $a_n$ is the number of the semicircle $C_i$ containing $f^n(x)$).
Denote by $\Omega$ the space of all these sequences (i.e.,
$\Omega = \{0, 1\}^{\mathbb{Z}_+}$). The word "space" hints that $\Omega$ will
not be merely a set, but will be endowed with some structure. There will be two
structures on $\Omega$: a topology and a measure.

As regards topology, we take the discrete topology (each point is an open set)
in each factor $\{0, 1\}$ of the infinite product $\{0, 1\}^{\mathbb{Z}_+}$ and
then endow this product with the Tikhonov product topology. According to the
Tikhonov theorem, $\Omega$ is compact as a product of compact spaces. In this
case the topology on $\Omega$ is induced by a metric, e.g. one can take

$$\rho(x, y) = \sum_n \frac{d(x_n, y_n)}{2^n}, \qquad x = (x_0, x_1, \ldots), \quad y = (y_0, y_1, \ldots),$$

where $d(a, b) = 0$ for $a = b$ and $d(a, b) = 1$ for $a \ne b$. Using this
metric, one can easily prove the compactness of $\Omega$ without referring to
the general theorem.

A subset $A \subset \Omega$ is called a cylindric set if it consists of all
sequences $x$ such that some prescribed coordinates $x_{i_1}, \ldots, x_{i_n}$
of $x$ are given numbers $a_{i_1}, \ldots, a_{i_n}$, while the other coordinates
are arbitrary. Cylindric sets are open in the topology used; moreover, they
constitute a base for this topology. They are also closed; the existence of so
many open-closed sets means that $\Omega$ is zero-dimensional.

As we have started to speak about products, we shall sometimes call the $n$-th
element $x_n$ of the sequence $x = (x_0, x_1, \ldots)$ its $n$-th coordinate
(once more, they are numbered beginning from the 0-th coordinate).

Binary expansions were binary expansions of the cyclic coordinates of the points
of $\mathbb{S}^1$. In our new language we introduce the map

$$\pi : \Omega \to \mathbb{S}^1, \qquad (a_0, a_1, a_2, \ldots) \mapsto p(0,a_0 a_1 a_2 \ldots). \tag{3}$$

It is a continuous map. There exists a countable set of points having two
preimages, but for the "vast majority" of points there is only one preimage.
Multiplying cyclic coordinates by 2 is now replaced by the "one-sided Bernoulli
shift" $\sigma$, moving the whole sequence one step to the left and omitting its
first symbol; that is,

for $x = (x_0, x_1, \ldots)$, $\sigma(x) = (y_0, y_1, \ldots)$ where $y_n = x_{n+1}$ for all $n \in \mathbb{Z}_+$.

It is clear that $\pi \circ \sigma = f \circ \pi$. In this sense one can say
that our construction provides a "symbolic model" for our original map
$f : \mathbb{S}^1 \to \mathbb{S}^1$.
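The relation between the shift on sequences and the doubling of the cyclic coordinate can be tested on finite data. This is only a sketch: an infinite sequence is represented by a finite tuple that is assumed to be continued by zeros, and the names `pi_map`, `shift` and `doubling` are hypothetical helpers introduced here.

```python
from fractions import Fraction
from itertools import product

def pi_map(prefix):
    """Cyclic coordinate of the point coded by prefix followed by zeros.

    The digit a_n (numbered from n = 0) contributes a_n / 2^(n+1).
    """
    return sum(Fraction(a, 2 ** (n + 1)) for n, a in enumerate(prefix)) % 1

def shift(prefix):
    """One-sided shift: drop the first symbol."""
    return prefix[1:]

def doubling(x):
    """The map f(x) = 2x in the cyclic coordinate."""
    return (2 * x) % 1

# Check the commutation relation on all binary words of length 6.
commutes = all(
    pi_map(shift(word)) == doubling(pi_map(word))
    for word in product((0, 1), repeat=6)
)
print(commutes)

# Two different codes approaching the same point 1/2:
# (1, 0, 0, ...) and (0, 1, 1, ..., 1, 0, 0, ...) with k ones.
k = 20
gap = pi_map((1,)) - pi_map((0,) + (1,) * k)
print(gap)
```

The second computation illustrates why some points have two preimages: the gap between the two coded points equals $1/2^{k+1}$ and vanishes as the block of ones becomes infinite.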

The point $x$ and its trajectory $\{f^n(x)\}$ are "coded" by a sequence
$(a_0, a_1, \ldots)$ (once more: the $n$-th element of this sequence is the
number such that $f^n(x) \in C_{a_n}$). This sequence could be called "a journey
diary of $x$". Yu.S. Il'yashenko uses the more impressive name "a fate of $x$".
Below we often call this sequence simply "a code of $x$".

This trick ("diary", "fate", "coding") is by no means restricted to our example.
If some set $X$ is decomposed into nonintersecting sets

$$X = X_1 \cup \ldots \cup X_k, \qquad X_i \cap X_j = \emptyset \ \text{for } i \ne j, \tag{4}$$

then for any map $f : X \to X$ we can introduce "a journey diary" of a point
$x \in X$ (with respect to the decomposition (4)): this "diary" is an infinite
sequence $(a_n;\ n \in \mathbb{Z}_+)$ such that $f^n(x) \in X_{a_n}$. Of course,
the decomposition (4) must be somehow adjusted to the structures which are
specific for the example or the class of examples we are going to consider (and
which are somehow respected by $f$). Besides this general demand, a special
choice of the decomposition used may take into account more specific properties
of $f$. Also, in our case this general approach is slightly modified.
Essentially we are using the partition $\mathbb{S}^1 = C_0 \cup C_1$, which is
not a decomposition in the strict sense: $C_0 \cap C_1 \ne \emptyset$. As a
result, the encoding of the point $x$ by a sequence $(a_n)$ does not always
supply us with a single-valued function $x \mapsto (a_n)$: some points of
$\mathbb{S}^1$ (those with binary-rational cyclic coordinates) have several
(two) "journey diaries". This would not happen if we took
$C_0 = p\left(\left[0, \frac12\right)\right)$,
$C_1 = p\left(\left[\frac12, 1\right)\right)$. In the language of binary
expansions, this would mean that we rule out expansions of the form
$0,a_1 \ldots a_k 1 1 \ldots 1 \ldots$, i.e. those which are periodic after some
place with the period^ consisting of the single digit 1. However, in practice
one uses such binary expansions, and we shall also use the closed arcs $C_i$.

Our "journey diary" can be described, in accordance with the general remark
above, in terms of the dynamics and the partition
$\mathbb{S}^1 = C_0 \cup C_1$, without appealing to binary expansions:

$$x \mapsto (a_n) \quad \text{if and only if} \quad f^n(x) \in C_{a_n} \ \text{for all } n \in \mathbb{Z}_+. \tag{5}$$

This makes it evident that if $x \mapsto a = (a_0, a_1, a_2, \ldots)$, then
$f(x) \mapsto (a_1, a_2, a_3, \ldots)$. But essentially we have also used binary
expansions in the definition of the map (3) inverse to the (multi-valued) coding
$x \mapsto (a_n)$ (which makes it evident that any sequence $(a_n)$ codes some
$x$). Here it is also easy to get rid of them.

(5) is equivalent to $\pi((a_n)) \in \bigcap_{n=0}^{\infty} f^{-n}(C_{a_n})$, i.e.

$$\text{for all } N \in \mathbb{Z}_+ \qquad \pi((a_n)) \in \bigcap_{n=0}^{N} f^{-n}(C_{a_n}). \tag{6}$$

Define $F_N = \bigcap_{n=0}^{N} f^{-n}(C_{a_n})$. Clearly
$F_0 \supset F_1 \supset \ldots \supset F_N \supset \ldots$. It turns out that

$$F_N \ \text{is a closed arc of the length } \frac{1}{2^{N+1}}. \tag{7}$$

This implies the existence and uniqueness of the point common to all $F_N$. It
also implies the continuity of $\pi$. Indeed, if $\rho((a_n), (b_n))$ is small,
which implies that $a_n = b_n$ for all $n = 0, 1, \ldots, N$ with some big $N$,
then both $\pi((a_n))$ and $\pi((b_n))$ lie within the same arc $F_N$ of the
small length $\frac{1}{2^{N+1}}$.

^Here and later we shall often use the word "period" as denoting the periodic
part of the infinite sequence, not merely the length of this part.

As regards (7), it can be proved as follows. Clearly $f^{-n}(C_0)$ and
$f^{-n}(C_1)$ are disjoint unions of $2^n$ closed arcs of the form
$\left[\frac{i}{2^{n+1}}, \frac{i+1}{2^{n+1}}\right]$ with some
$i \in \{0, 1, \ldots, 2^{n+1} - 1\}$, $i$ being even for the arcs from
$f^{-n}(C_0)$ and odd for the arcs from $f^{-n}(C_1)$ ($f^n$ maps any such arc
homeomorphically onto $C_0$ if $i$ is even and onto $C_1$ if $i$ is odd). Any
arc $\left[\frac{j}{2^{N+1}}, \frac{j+1}{2^{N+1}}\right]$ consists of two arcs
of the form

$$\left[\frac{2j}{2^{N+2}}, \frac{2j+1}{2^{N+2}}\right], \qquad \left[\frac{2j+1}{2^{N+2}}, \frac{2j+2}{2^{N+2}}\right]. \tag{8}$$

Thus if we already know that $F_N$ is an arc of the type described (which is
trivial for $N = 0$), then passing to $F_{N+1}$ means that we pass to one of the
arcs (8) (to the first arc if $a_{N+1} = 0$ and to the second arc if
$a_{N+1} = 1$).
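The inductive argument for (7) amounts to repeated halving, and this can be made explicit. The code below is a sketch only; `arc_of_code`, `in_semicircle` and `follows_code` are names introduced here, and arcs are represented by the endpoints of an interval of cyclic coordinates.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

def arc_of_code(code):
    """Endpoints (left, right) of F_N for the code (a_0, ..., a_N).

    Each digit selects the first or the second half of the current arc.
    """
    left, length = Fraction(0), Fraction(1)
    for a in code:
        length /= 2
        left += a * length
    return left, left + length

def in_semicircle(x, a):
    """Membership in the closed arcs C_0 = p([0, 1/2]) and C_1 = p([1/2, 1])."""
    x = x % 1
    if a == 0:
        return x <= HALF
    return x >= HALF or x == 0   # p(1) = p(0) also belongs to C_1

def follows_code(x, code):
    """True if f^n(x) lies in C_{a_n} for every digit a_n of the code."""
    return all(in_semicircle(2 ** n * x, a) for n, a in enumerate(code))

left, right = arc_of_code((1, 0, 1, 1))
print(left, right, right - left)

# For every code of length 5, both endpoints and the midpoint follow the code.
consistent = all(
    all(follows_code(x, code) for x in (l, (l + r) / 2, r))
    for code in product((0, 1), repeat=5)
    for l, r in [arc_of_code(code)]
)
print(consistent)
```

For a code with $N + 1$ digits the computed arc has length $1/2^{N+1}$, in agreement with (7).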

Our map $f$ is very simple, and at first glance it is not clear whether our
symbolic model is useful for any purpose. We shall see that it is.

It turns out that one can introduce a measure $\mu$ on $\Omega$ such that
$\mu(A) = \frac{1}{2^n}$ for any cylindric $A$ defined by fixing $n$
coordinates. (Of course, dealing with a topological space, we consider only
measures which are in a sense compatible with the topology. In our case, when
the space is a metrizable compact set, this means simply that all Borel sets are
measurable.) The existence of such a measure is a simple case of some general
theorems of measure theory and/or of probability theory, but in this case the
argument can be much easier. Consider first cylindric sets of the following
special character: they are defined by fixing the first $n$ coordinates of their
points; i.e. we speak about the sets

$$B_{a_0, \ldots, a_{n-1}} = \{x = (x_0, x_1, \ldots);\ x_0 = a_0, \ldots, x_{n-1} = a_{n-1}\}.$$

This set is mapped under $\pi$ onto the arc $C_{a_0, \ldots, a_{n-1}}$. The
length of this arc is equal to $1/2^n$, which is just what we want to be the
measure of $B_{a_0, \ldots, a_{n-1}}$. Going further, we observe that any
cylindric set $A$ is a finite union of sets $B_{a_0, \ldots, a_{n-1}}$, and
$\pi$ maps such a union onto a finite system of the arcs considered. It is easy
to check that the total length of these arcs is just what we want to be
$\mu(A)$. And this gives us an idea how to define $\mu$: we simply define it as
the preimage of the standard Lebesgue measure (denoted by mes) on
$\mathbb{S}^1$ (or, if you prefer, on $[0, 1)$; the Lebesgue measure does not
feel the difference between them, which is due to just one point) under the map
$\pi$. Although $\pi$ is not a bijection, the violation of bijectivity is
negligible from the measure-theoretic point of view. So $\pi$ is an isomorphism
of the measure spaces $(\Omega, \mu)$ and $(\mathbb{S}^1, \mathrm{mes})$.

An important property of this measure is that for any measurable set
$A \subset \Omega$ its preimage $\sigma^{-1}(A)$ is also measurable (thus
$\sigma$ is measurable) and

$$\mu(\sigma^{-1}(A)) = \mu(A). \tag{9}$$

In such cases one says that the measure $\mu$ is invariant with respect to
$\sigma$. (Literally this expression would mean that $\mu(\sigma(A)) = \mu(A)$.
But this is wrong. When dealing with any noninvertible map $\sigma$, one always
understands preservation of measure as the measurability of this map plus the
property (9).)

The basic fact here is that these two properties (measurability of
$\sigma^{-1}(A)$ and (9)) are true for cylindric $A$. Let $A$ be described by
fixing the coordinates $x_{i_1}, \ldots, x_{i_n}$ of its points $x$ (so
$\mu(A) = \frac{1}{2^n}$). The preimage $\sigma^{-1}(x)$ consists of two points
$y$ and $z$. Both have the same coordinates with numbers $i > 0$, namely
$y_i = z_i = x_{i-1}$ (indeed, after the shift of $y$ and $z$ one step to the
left one must get $x_{i-1}$ on the $(i-1)$-st place), while $y_0 = 0$ and
$z_0 = 1$ (thus no restrictions are imposed on the zeroth coordinate of the
points of $\sigma^{-1}(A)$: it can be 0 or 1 and this has no influence on the
other coordinates). It follows that $\sigma^{-1}(A)$ is the cylindric set for
which restrictions are imposed on the coordinates
$x_{i_1+1}, \ldots, x_{i_n+1}$. These are $n$ coordinates, and so
$\mu(\sigma^{-1}(A)) = \frac{1}{2^n} = \mu(A)$.

After this one can use more or less standard arguments from measure theory. We
shall repeat them, making simplifications due to the specific features of our
case. Let $A$ be a finite union of cylindric sets $A_1, \ldots, A_n$. Then
$\sigma^{-1}(A)$ is a finite union of their preimages $\sigma^{-1}(A_i)$, which
are also cylindric sets and thus measurable. This proves the measurability of
$\sigma^{-1}(A)$. Comparison of its measure with the measure of the original $A$
needs more consideration. Each $A_i$ is described by fixing a finite number of
coordinates, say, fixing the coordinates $x_j$ with $j \in J_i$, where $J_i$ is
some finite set of nonnegative integers. Let $N = \max(J_1 \cup \ldots \cup J_n)$.
Any $A_i$ can be presented as a finite union of some sets of the form
$B_{a_0, \ldots, a_N}$. (Say, let the restrictions describing $A_1$ be
$x_1 = 0$, $x_2 = 1$ and the restrictions describing $A_2$ be $x_0 = 1$ and
$x_4 = 0$. Then $J_1 \cup J_2 = \{0, 1, 2, 4\}$ and $N = 4$. We have

$A_1 = B_{00100} \cup B_{00101} \cup B_{00110} \cup B_{00111} \cup B_{10100} \cup B_{10101} \cup B_{10110} \cup B_{10111}$,
$A_2$ = union of the 8 sets $B_{1, a_1, a_2, a_3, 0}$ for all $(a_1, a_2, a_3) \in \{0, 1\}^3$.)

A finite union of the $A_i$ is also a finite union of some
$B_{a_0, \ldots, a_N}$; these $B_{\ldots}$ do not intersect each other and
$\mu(\sigma^{-1}(B_{a_0, \ldots, a_N})) = \mu(B_{a_0, \ldots, a_N})$; it follows
that $\mu(\sigma^{-1}(A)) = \mu(A)$.
Now any open set $U$ can be represented as the union of an increasing sequence

$$U_1 \subset U_2 \subset \ldots \subset U_n \subset \ldots$$

of sets each of which is a finite union of cylindric sets. (So
$\mu(U) = \lim_{n \to \infty} \mu(U_n)$.) Then $\sigma^{-1}(U)$ is the union of
the increasing sequence

$$\sigma^{-1}(U_1) \subset \sigma^{-1}(U_2) \subset \ldots \subset \sigma^{-1}(U_n) \subset \ldots.$$

Each $\sigma^{-1}(U_n)$ is measurable (thus the union $\sigma^{-1}(U)$ of these
sets is also measurable and
$\mu(\sigma^{-1}(U)) = \lim_{n \to \infty} \mu(\sigma^{-1}(U_n))$) and has the
same measure as $U_n$. It follows that $\mu(\sigma^{-1}(U)) = \mu(U)$.

The next step is to consider closed $A$. As
$\sigma^{-1}(A) = \Omega \setminus \sigma^{-1}(\Omega \setminus A)$, it is easy
to see that $\sigma^{-1}(A)$ is measurable and its measure equals $\mu(A)$.

Finally consider an arbitrary measurable $A$. For any $\varepsilon > 0$ there
exist a closed set $C$ and an open set $U$ such that $C \subset A \subset U$ and
$\mu(U) - \mu(C) < \varepsilon$ (in particular,
$|\mu(U) - \mu(A)| < \varepsilon$). Then
$\sigma^{-1}(C) \subset \sigma^{-1}(A) \subset \sigma^{-1}(U)$, the first set is
closed, the last set is open, and the difference of their measures is the same
as for the original $U$, $C$, i.e. it is less than $\varepsilon$. The fact that
$\sigma^{-1}(A)$ contains some measurable set and is contained in some open set,
and that the measures of these sets can be made arbitrarily close to each other,
implies that $\sigma^{-1}(A)$ is measurable. It follows also that
$|\mu(\sigma^{-1}(A)) - \mu(\sigma^{-1}(U))| < \varepsilon$. And as
$\mu(\sigma^{-1}(U)) = \mu(U)$, we see that
$|\mu(\sigma^{-1}(A)) - \mu(A)| < 2\varepsilon$. As $\varepsilon$ is arbitrary,
we conclude that $\mu(\sigma^{-1}(A)) = \mu(A)$.
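For sets determined by finitely many coordinates, property (9) can be verified by brute force. The following is a sketch under the assumption that it suffices to count binary words of a fixed length `depth`; the dictionary encoding of cylindric sets and all function names are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import product

def in_cylinder(word, constraints):
    """constraints maps a coordinate number i to the prescribed value a_i."""
    return all(word[i] == a for i, a in constraints.items())

def measure(predicate, depth):
    """Share of binary words of length `depth` satisfying the predicate.

    This equals mu for sets determined by the first `depth` coordinates.
    """
    words = list(product((0, 1), repeat=depth))
    return Fraction(sum(1 for w in words if predicate(w)), len(words))

def shift(word):
    """One-sided shift: drop the first symbol."""
    return word[1:]

A1 = {1: 0, 2: 1}   # x_1 = 0, x_2 = 1
A2 = {0: 1, 4: 0}   # x_0 = 1, x_4 = 0
depth = 6           # enough coordinates for A1, A2 and their preimages

def in_union(w):
    return in_cylinder(w, A1) or in_cylinder(w, A2)

mu_union = measure(in_union, depth)
mu_preimage = measure(lambda w: in_union(shift(w)), depth)
print(mu_union, mu_preimage)

# The forward image behaves differently: for B = {x_0 = 0} the image is everything.
image = {shift(w) for w in product((0, 1), repeat=depth) if w[0] == 0}
mu_image = Fraction(len(image), 2 ** (depth - 1))
print(mu_image)
```

The last computation illustrates the warning made after (9): the set $\{x_0 = 0\}$ has measure 1/2, while its image under the shift is the whole space.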

Now it is time to explain what was discovered by Borel (not the description of
the multiplication by 2 in terms of binary expansions, of course). Borel
observed that the dynamical system $(\Omega, \sigma, \mu)$^ describes the
classical object of probability theory: a sequence of independent trials
consisting in flipping a coin. This discovery was important for the development
of the treatment of the foundations of probability theory on the basis of
measure theory^. In full generality this treatment was elaborated by
A.N. Kolmogorov in the 1930s and became standard. Having this treatment in mind,
we can consider $(\Omega, \sigma, \mu)$ as an early manifestation of this
treatment applied to coin flipping.

^As we have already said, actually he spoke of
$([0, 1),\ x \mapsto \{2x\},\ \mathrm{mes})$, but this difference is not
important from the point of view of his goal.

^Borel's work was also influential in other respects (some hint of this will be
given below), but at the moment we dwell only on the one side of it which is
close to our main topic.

We shall use three basic notions: a random event, probability and independence.
Essentially they cannot be defined in terms of notions from other parts of
science. They can only be illustrated by examples on a semi-intuitive level. But
the mutual relations of these notions can be described completely using other
mathematical notions. Essentially this is the usual situation with the basic
notions in any part of mathematics^.

First consider finite sequences of independent coin flips. Say, let us flip a
coin three times. An example of a random event: we have got 0 after the first
flip, 1 after the second flip, and 0 after the third one. This can be denoted by
the finite sequence (0, 1, 0). This is an example of what is called an
elementary event. In our case an elementary event describes the result of a
flip repeated three times. So there are eight elementary events, described by
the 8 binary sequences $(a_1, a_2, a_3)$ with all $a_i = 0$ or 1. We can even
adopt a formal point of view, considering these sequences themselves as
elementary events. Their collection $\{0, 1\}^3$ is what is called the space of
elementary events. An example of a non-elementary event $A$: the sum of the
numbers associated with the three flips is odd. This happens if and only if the
results of the three subsequent coin flips are
(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1). Thus we can consider an event as a
subset of the space of elementary events. The event $B$ consisting in 0 being
the result of the first flip and the sum of the numbers associated with the 3
flips being odd is the subset of the previous $A$ consisting of (0, 0, 1) and
(0, 1, 0). Going further, we say that any result of a single flip of the coin
appears with the probability $\frac12$. (In practice this is interpreted as
follows: if we flip the coin many times, or if we flip many coins
simultaneously, approximately half of these trials will have the result 0. Once
more: from the point of view described, this statement is not the definition of
probability, but merely a kind of intuitive explanation, or illustration, of
this basic notion.) This is because the coin is assumed to be "fair", i.e.
symmetric with respect to both its sides. Independence of the subsequent flips
of the coin manifests itself in the fact that the probability of any elementary
event $(a_1, a_2, a_3)$ is $\frac18$.

We do not know whether there exist "false" coins such that the probabilities of
0 and 1 are considerably different from $\frac12$.^ But there certainly exist
loaded dice. According to the literature, they are even of some practical
importance. If a die is "fair", i.e. symmetric with respect to its faces and
made from homogeneous material, then the probability of any of its faces being
shown after throwing the die is $\frac16$. For a loaded die these probabilities
are some numbers $p_1, p_2, p_3, p_4, p_5, p_6$ such that all $p_i \ge 0$ and
$\sum p_i = 1$. Assuming that we deal with a nonsymmetric coin, there is a
probability $p_0$ that the result of a flip of the coin will be 0 and a
probability $p_1$ that this result will be 1. The numbers $p_i$ are $\ge 0$ and
their sum $p_0 + p_1 = 1$. In such a case an elementary event
$(a_1, a_2, \ldots, a_n)$ has the probability
$p_{a_1} p_{a_2} \cdots p_{a_n}$.

^Euclid's claim that "a point is what has no parts", so often criticized as
"naive, obscure and having no real content", is merely a naive way to say that
in Euclidean geometry we deal with some sets (the 3-dimensional Euclidean space
and its subsets) endowed with some structure described by the axioms, and that
points are just elements of these sets. As such, they really have no parts. A
Hilbert space $H$ can well be some class of functions, and functions themselves
are rather complicated things; but as a point of $H$ each function is considered
as something which is "primitive, elementary, without intrinsic structure".

^There are similar procedures with probability different from 1/2. For example,
spinning a newly-minted U.S. penny on a smooth table tends to show fewer "heads"
than "tails" (as Lincoln's head outweighs the other side). For some manners of
spinning the probability of "heads" can be as small as 0,1.

Be the coin fair or not, after we have defined the probabilities of elementary
events, the probability of any event $A$ is just the sum of the probabilities of
its elements (of the elementary events belonging to $A$). So we get some
structure on the space of elementary events. Speaking solemnly, it is a measure
defined there.
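The rule "the probability of an event is the sum of the probabilities of its elementary events" is easy to try out for three flips. This is a sketch only; the function names and the unfair probability $p_0 = 1/10$ are assumptions made here for illustration.

```python
from fractions import Fraction
from itertools import product

def event_probability(event, p0, flips=3):
    """Sum of the probabilities of the elementary events belonging to `event`.

    p0 is the probability of the result 0 in a single flip; an elementary
    event (a_1, ..., a_n) has probability p_{a_1} * ... * p_{a_n}.
    """
    p = {0: p0, 1: 1 - p0}
    total = Fraction(0)
    for outcome in product((0, 1), repeat=flips):
        if event(outcome):
            weight = Fraction(1)
            for a in outcome:
                weight *= p[a]
            total += weight
    return total

def odd_sum(outcome):
    """Event A: the sum of the three results is odd."""
    return sum(outcome) % 2 == 1

def first_zero_and_odd_sum(outcome):
    """Event B: the first flip gave 0 and the sum is odd."""
    return outcome[0] == 0 and odd_sum(outcome)

fair = Fraction(1, 2)
unfair = Fraction(1, 10)   # illustrative value of p_0 for an "unfair" coin

print(event_probability(odd_sum, fair), event_probability(first_zero_and_odd_sum, fair))
print(event_probability(odd_sum, unfair), event_probability(first_zero_and_odd_sum, unfair))
```

For the fair coin each of the eight elementary events has weight 1/8, so the events $A$ and $B$ receive the probabilities 4/8 and 2/8 respectively.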

We can flip a coin 3 times but pay attention only to what happens at the first
two flips. This means that we take the evident projection^

$$p : \{0, 1\}^3 \to \{0, 1\}^2, \qquad p(a_1, a_2, a_3) = (a_1, a_2)$$

and pay attention only to those events (subsets of $\{0, 1\}^3$) which are
preimages of subsets of $\{0, 1\}^2$ (essentially, of those events which
happened during the first two trials). Using the analogous projection

$$p_1 : \{0, 1\}^3 \to \{0, 1\}, \qquad p_1(a_1, a_2, a_3) = a_1,$$

we can say that in the previous example with the events $A$, $B$

$$B = p_1^{-1}\{0\} \cap A.$$

Idealizing reality, we shall consider an infinite sequence of coin flips. An
elementary event is now the result of such a sequence of trials; it can be
described by an infinite sequence $(a_0, a_1, a_2, \ldots)$ of symbols 0, 1.
More formally, we shall regard these sequences themselves as elementary events.
It will be convenient for us to make a slight modification of what was said and
to assume that the coin is lying before us and we see which face is up at the
moment; let $a_0$ be the number associated with this face. An elementary event
from now on is an infinite sequence $(a_0, a_1, \ldots, a_n, \ldots)$ where,
once more, $a_0$ is what we see at the very beginning (at the moment zero) and
$a_n$ is the result of the $n$-th trial; assuming that a trial is made every
second, it is what we shall see at the $n$-th second. Then
$\{0, 1\}^{\mathbb{Z}_+}$ is the space of elementary events. Earlier we had the
notion of a cylindric set. Such sets, appearing when we fix some coordinates
(say, the coordinates with numbers $i_1, \ldots, i_n$), correspond to the point
of view when we are interested not in the results of all trials, but only in the
results of the trials with numbers $i_1, \ldots, i_n$. Using the evident
projection

$$\Omega \to \{0, 1\}^n, \qquad (a_i;\ i \in \mathbb{Z}_+) \mapsto (a_{i_1}, \ldots, a_{i_n}),$$

we see that cylindric sets are preimages of elementary events from
$\{0, 1\}^n$ under this projection. (Note that $n$ can be different for
different cylindric sets.)

^9 Don't confuse it with the map $\mathbb{R} \to \mathbb{S}^1$ also denoted by $p$.

Cylindric sets certainly must be considered as events (to see such and such
faces at such and such moments of time is certainly a rather elementary kind
of event). If restrictions are imposed at $n$ moments of time, the probability of
the cylindric set is $1/2^n$ if the coin is "fair". For an "unfair" coin the probability
is $p_{a_{i_1}} \cdots p_{a_{i_n}}$, i.e. if $k$ of the numbers $a_{i_j}$ are 0 (and $n - k$ are 1), then the
probability is $p_0^k p_1^{n-k}$. After this one can define the notion of probability for
some more complicated subsets of $\Omega$. Essentially it is the same process which
could have been used for defining the measure $\mu$ above, had we not done this differently,
namely by defining $\mu$ as the preimage of the standard Lebesgue measure mes under
the map (3). In any case, for a "fair" coin we already have the desired measure
at our disposal: this is just the $\mu$ constructed above. For an "unfair" coin
we have to do some work which we shall omit. By the way, in this case one
can again obtain $\mu$ as the preimage of some measure on $\mathbb{S}^1$, but this measure
on $\mathbb{S}^1$ is not the well-known Lebesgue measure, but some Lebesgue-Stieltjes
measure. Many textbooks describe a construction of such a measure on the basis of a
given distribution function; taking this for granted, we can easily
pass to $\mu$: we mainly have only to describe the distribution function which we
need, and this is relatively easy. Of course, in both cases one can avoid going
into details with $\mu$ simply because they are essentially contained in the more
well-known construction of the Lebesgue measure or of the slightly less well-known
Lebesgue-Stieltjes measure. The latter construction, which historically
was the prototype of analogous and more general constructions, also begins from
the most elementary case ("measure of an interval is its length") and then goes
step by step to more general sets. The simplification in our case is due to the fact
that we need not imitate this construction but can use in a formal way the results
of this construction carried over to $\mathbb{S}^1$ or, what is the same, to $[0, 1)$.

And now we can finish comparing our dynamical system with the random
process of coin flips. A random function is a measurable function on $\Omega$. A
random process is a sequence of random functions $\varphi_n$; $\varphi_n(x)$ is what we shall
observe at the moment $n$ provided an elementary event $x$ is realized. Denote by
$\varphi$ the function on $\Omega$ which is simply the projection onto the zeroth coordinate. Then
the result of the $n$-th flip is $\varphi(\sigma^n(x))$. It is a sequence of numbers describing
into which of our semicircles $C_0$, $C_1$ the moving phase point (jumping every
second from $x$ to $f(x)$) comes at the moment $n$.

Borel showed how the notions and facts of measure theory^10 can be used in order to
define in a reasonable form the notion of probability for a rather broad class
of events (subsets of $\Omega$). This made it possible to study problems in which the whole
infinite sequence of trials is involved in a more essential way than before.
Borel's strong law of large numbers was the first example of this new trend,
which turned out to be fruitful. This is what we had in mind when saying that
Borel's impact on the foundations of probability theory was only one side
of his work (but, of course, these sides were closely tied).

But at the same time Borel encountered an example of "chaoticity" in the
theory of dynamical systems. This was not understood in his time, which is one more
manifestation of the chaoticity in this area. The fact that there are dynamical
systems which are, so to speak, "intrinsically chaotic" (chaotic due to their own
dynamics, not because of exterior perturbations) and the mechanism making
them chaotic^11 were understood much later, in the 1960s.

^10 Needless to recall that it was he who started fruitful work towards the creation of this
theory, disregarding earlier attempts which were much less satisfactory.



2 Hyperbolic automorphisms of the 2-torus 

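An algebraic automorphism of the 2-torus $\mathbb{T}^2 = \mathbb{R}^2/\mathbb{Z}^2$ (the standard projection
$\mathbb{R}^2 \to \mathbb{T}^2$ will be denoted by $p$) is defined by a matrix $A \in \mathrm{SL}(2,\mathbb{Z})$ or
$A \in \mathrm{GL}(2,\mathbb{Z})$. Initially, $A$ acts on $\mathbb{R}^2$ and then this action projects onto $\mathbb{T}^2$. Namely,
$A$ defines a toric automorphism

$$\mathcal{A}\colon \mathbb{T}^2 \to \mathbb{T}^2, \qquad \mathcal{A}\,p(x) = p(Ax), \quad \text{i.e. } \mathcal{A}(x + \mathbb{Z}^2) = Ax + \mathbb{Z}^2.$$

$A$ and $\mathcal{A}$ are called hyperbolic if for the eigenvalues $\lambda, \mu$ of $A$ one has $|\lambda| > 1$,
$|\mu| < 1$. Let $E^u_A$ be the unstable eigendirection for $A$, i.e. a line $\mathbb{R}e$ in $\mathbb{R}^2$
where $Ae = \lambda e$; later we shall also need the stable eigendirection $E^s_A = \mathbb{R}e'$
where $Ae' = \mu e'$. Denote by $W^{u,s}_A$ the projections of $E^{u,s}_A$ to $\mathbb{T}^2$. They are dense
on the torus. Projections of the lines parallel to $E^{u,s}_A$ constitute an unstable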

^11 At least the mechanism making many systems chaotic. We do not claim that there can
be no other sources of chaoticity.

(expanding), resp. stable (contracting) foliation $\mathcal{W}^{u,s}_A$ on $\mathbb{T}^2$; it consists of the
lines obtained from $W^{u,s}_A$ under the action of the group shifts. (We shall need
$\mathcal{W}^{u,s}_A$ only in Parts 3 and 4.)

Figure 1 is a "standard" illustration for the hyperbolic automorphism of $\mathbb{T}^2$.
It concerns $A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ and presents the action of $\mathcal{A}$ on a figure $C$ in a
fundamental square $[0,1]^2$ (Fig. 1a). Traditionally, $C$ represents a cat's silhouette,
the so-called "Arnold's cat". On the covering plane the image of $C$ under the
action of $A$ partially leaves $[0,1]^2$ (Fig. 1b), so we cut it into several pieces and
return them into the unit square by shifts $(x, y) \mapsto (x + m, y + n)$ with $m, n \in \mathbb{Z}$
(Fig. 1c). Figure 1d illustrates the mixing property of this map: for any measurable
sets $X$ and $Y$ one has $\operatorname{mes}(\mathcal{A}^n X \cap Y) \to \operatorname{mes}(X)\operatorname{mes}(Y)$ as $n \to \infty$. This
means that for large $n$ the proportion of $Y$ occupied by $\mathcal{A}^n X$ is approximately the same as
the proportion of the entire torus occupied by $X$ (equivalently, by $\mathcal{A}^n X$). We see
that if $X = C$ and $Y$ is a quite large rectangle then even for $n = 3$ this equality
holds with good precision.
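The mixing property can be observed numerically. The sketch below is an illustration under assumptions of our own: the torus is replaced by the finite grid of points with coordinates in $\frac{1}{N}\mathbb{Z}$ (which the map with matrix $\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ permutes), and the sets $X = [0, \frac12)^2$, $Y = [0, \frac12) \times [0, 1)$, the grid size `N` and the function name `cat_map_fraction` are hypothetical choices, not objects from the text.

```python
def cat_map_fraction(n, N=400):
    """Fraction of the grid (Z/N)^2 occupied by the intersection of A^n(X) and Y.

    X = [0, 1/2) x [0, 1/2) and Y = [0, 1/2) x [0, 1) are illustrative choices.
    """
    half = N // 2
    hits = 0
    for i in range(half):
        for j in range(half):
            x, y = i, j                    # the grid point (i/N, j/N) of X
            for _ in range(n):             # apply [[2, 1], [1, 1]] modulo 1
                x, y = (2 * x + y) % N, (x + y) % N
            if x < half:                   # the image of the point lies in Y
                hits += 1
    return hits / (N * N)

# To be compared with mes(X) * mes(Y) = 1/4 * 1/2.
fraction = cat_map_fraction(5)
```

For $n = 0$ the function returns $\operatorname{mes}(X \cap Y) = \frac14$, and after a few iterations the value should be close to the product $\frac18$.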

The map $\mathcal{A}$ of the torus is in an evident sense expanding along $\mathcal{W}^u_A$ (expanding
in the direction of $W^u_A$), so one has the same phenomenon of quickly increasing
uncertainty as for the expanding circle map $f$ from Part 1.
Thus it is not surprising that the dynamical system $\{\mathcal{A}^n\}$ on $\mathbb{T}^2$ also resembles
some stochastic processes.

Many "stochastic" features of $\{\mathcal{A}^n\}$ were revealed by dealing with this system
itself. But now the most lucid way of revealing them is to use the so-called
"Markov partitions" introduced (in this case) by R.Adler and B.Weiss^12. They
will be considered in the next part. Here we dwell on another question. If we are
interested in hyperbolic automorphisms of $\mathbb{T}^2$, then why not try to classify
them?

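It is reasonable to consider two objects related to $\mathbb{T}^2$ as "similar" or "equivalent"
if there exists a homeomorphism $\varphi\colon \mathbb{T}^2 \to \mathbb{T}^2$ transforming one object into
another. This makes sense if we can speak about the action of $\varphi$ on the objects
considered. For the map $\mathcal{A}\colon \mathbb{T}^2 \to \mathbb{T}^2$ it is reasonable to say that $\varphi$ transforms
$\mathcal{A}$ into the map $\varphi \circ \mathcal{A} \circ \varphi^{-1}$.^13 So for the automorphisms $\mathcal{A}$, $\mathcal{B}$ of the two-torus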

^12 There exists a more general version of the Markov partitions. The first step towards its
elaboration was made by Ya.G.Sinay (partially together with B.M.Gurevich); the final version
is due to R.Bowen, who elaborated it for general hyperbolic sets. Subsequent steps were to
introduce (and to use) the analogous partitions (also called "Markov") for several objects
which are not hyperbolic sets but which resemble them in some important aspects: pseudo-Anosov
maps, Lorenz attractors, some billiards ... The works of various authors where these
steps were made could be very good, but as concerns the general idea of the Markov
partition, essentially here we meet not so much a further development of this general idea,
but rather its adaptation to a somewhat new situation.

We shall speak only about the case considered by Adler and Weiss. It is simpler
and more lucid geometrically than these generalizations and modifications. (Some exception is the
pseudo-Anosov case, which is also two-dimensional and also admits sufficiently understandable
pictures. (A.Yu.Zhirov even provided an album with such pictures, to appear at the site of
the Steklov Inst.) But this case is more complicated in its essence and, in our opinion, much
has to be done in this case before it will become comparable to the classical one in all respects.)

^13 As $\mathcal{A}$ maps $x$ into $\mathcal{A}(x)$, it is reasonable to say that $\varphi$ transforms $\mathcal{A}$ to the map which
maps $\varphi(x)$ to $\varphi(\mathcal{A}x)$.

we consider $\mathcal{B}$ as "similar" to $\mathcal{A}$ if and only if there exists a homeomorphism $\varphi$
such that $\mathcal{B} = \varphi \circ \mathcal{A} \circ \varphi^{-1}$. Then for the induced maps

$$(\mathcal{B})_*,\ (\mathcal{A})_*,\ \varphi_*\colon H_1(\mathbb{T}^2, \mathbb{Z}) \to H_1(\mathbb{T}^2, \mathbb{Z}) \qquad (10)$$

of the one-dimensional homology group we have

$$(\mathcal{B})_* = \varphi_* \circ (\mathcal{A})_* \circ \varphi_*^{-1}.$$

It is well known that under a suitable (and the most natural) choice of the basis
in $H_1(\mathbb{T}^2, \mathbb{Z})$ the maps (10) are described by the matrices $A$, $B$ and some $C \in \mathrm{GL}(2,\mathbb{Z})$.
Thus we have to deal with the usual conjugacy of the matrices $A$ and $B$. Of course,
now the conjugacy has to be performed via a matrix $C$ that itself belongs to
$\mathrm{SL}(2,\mathbb{Z})$ or $\mathrm{GL}(2,\mathbb{Z})$. Conversely, if $B = CAC^{-1}$ with $C \in \mathrm{GL}(2,\mathbb{Z})$, then
$\mathcal{B} = \mathcal{C}\mathcal{A}\mathcal{C}^{-1}$. So we arrive at the question: given hyperbolic $A$ and $B$, how to
decide whether they are conjugate in $\mathrm{GL}(2,\mathbb{Z})$?

If we consider a broader conjugacy, $A \sim B$ if and only if $B = CAC^{-1}$ with
some $C \in \mathrm{GL}(2,\mathbb{C})$, one can find the answer in a usual course of linear algebra.
A necessary condition for such equivalence is that $A$ and $B$ have the same
eigenvalues. And if the eigenvalues of a matrix are different (which is the case for
our $A$ and $B$), this condition is also sufficient. Moreover, if the eigenvalues are
real (which is also the case for our $A$, $B$), then the conjugacy can be performed
via a real matrix, i.e. there exists $C \in \mathrm{GL}(2,\mathbb{R})$ such that $B = CAC^{-1}$.

But we want to have $C \in \mathrm{SL}(2,\mathbb{Z})$ or $C \in \mathrm{GL}(2,\mathbb{Z})$. It turns out that this
really is an additional requirement.

This was known to Gauss. Indeed, Gauss reduced the question to a question
in the theory of binary quadratic forms, which he solved.
Now we describe this reduction.

Let $q = (A, B, C)$ be a quadratic form. For our consideration, we suppose
all coefficients of quadratic forms to be integer. We define its action on a vector
$z = (x, y)^T$ as $q(z) = Ax^2 + Bxy + Cy^2$. Further, the discriminant of the quadratic
form $q$ is denoted by $\operatorname{disc} q$ and is equal to $B^2 - 4AC$. We denote by $Q(D)$ the
class of all quadratic forms with $\operatorname{disc} q = D$. The group $\mathrm{SL}_2(\mathbb{Z})$ acts on $Q(D)$
by the natural formula

$$(g^* q)(z) = q(g^{-1} z).$$

On the other hand, this group acts on the sets $H_\pm(t)$ of all hyperbolic automorphisms
with a given trace $t$ and a given determinant $\pm 1$ by conjugation:

$$a_g(X) = g X g^{-1}.$$

Now we construct a bijection $f\colon H_\pm(t) \to Q(t^2 \mp 4)$ such that the following
diagram is commutative:

$$\begin{array}{ccc} H_\pm(t) & \xrightarrow{\ f\ } & Q(t^2 \mp 4) \\ \downarrow{\scriptstyle a_g} & & \downarrow{\scriptstyle g^*} \\ H_\pm(t) & \xrightarrow{\ f\ } & Q(t^2 \mp 4) \end{array} \qquad (11)$$
This diagram performs the desired reduction.

Now, to prove (11), put $f(X)(z) = \det(z, Xz)$; here $(z, Xz)$ is the $2 \times 2$
matrix consisting of the two columns $z$ and $Xz$. Firstly, for $X = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, by direct calculation we
obtain

$$f(X)(z) = c x^2 + (d - a) x y - b y^2,$$

so $\operatorname{disc}(f(X)) = t^2 - 4\det X = t^2 \mp 4$. Then, for any form $q = (A, B, C) \in
Q(t^2 \mp 4)$ there exists a unique $X = \begin{pmatrix} a & b \\ c & t - a \end{pmatrix} \in H_\pm(t)$ such that $f(X) = q$.
Indeed, $c = A$, $b = -C$, $a = (t - B)/2$, and to check that $a$ is an integer we note
that $B^2 - t^2 = 4AC \mp 4$, so $B$ and $t$ are of the same parity.
Finally, we prove that the diagram is commutative:

$$f(a_g(X))(z) = \det(z, gXg^{-1}z) = \det(g)\det(g^{-1}z, Xg^{-1}z) = \det(g) \cdot f(X)(g^{-1}z) = \det(g) \cdot (g^*(f(X)))(z),$$

so, since $\det(g) = 1$, the proof is completed.
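The reduction can be checked on concrete matrices. The following is a minimal sketch, assuming the explicit expression $f(X) = (c,\ d - a,\ -b)$ for $X = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, which follows from expanding $\det(z, Xz)$; the function names and the sample matrices are illustrative choices.

```python
def mat_mul(M, N):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inverse(g):
    """Inverse of an integer matrix whose determinant is +1 or -1."""
    (a, b), (c, d) = g
    det = a * d - b * c                    # assumed to be +1 or -1
    return ((d * det, -b * det), (-c * det, a * det))

def form_of(X):
    """Coefficients (A, B, C) of the quadratic form z -> det(z, Xz)."""
    (a, b), (c, d) = X
    return (c, d - a, -b)

def disc(q):
    """Discriminant B^2 - 4AC of the form q = (A, B, C)."""
    A, B, C = q
    return B * B - 4 * A * C

def act_on_form(g, q):
    """Coefficients of g^* q, where (g^* q)(z) = q(g^{-1} z)."""
    (p, u), (r, s) = inverse(g)
    A, B, C = q
    return (
        A * p * p + B * p * r + C * r * r,
        2 * A * p * u + B * (p * s + u * r) + 2 * C * r * s,
        A * u * u + B * u * s + C * s * s,
    )

def conjugate(g, X):
    """The matrix a_g(X) = g X g^{-1}."""
    return mat_mul(mat_mul(g, X), inverse(g))

X = ((2, 1), (1, 1))        # trace 3, determinant 1
g = ((1, 1), (0, 1))        # an element of SL(2, Z)
left = form_of(conjugate(g, X))      # f(a_g(X))
right = act_on_form(g, form_of(X))   # g^*(f(X))
```

Both ways around the diagram give the same form, and its discriminant equals $t^2 - 4$.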

But we prefer to present an answer to our question (not only the statement
of this answer, but also the way leading to it) in terms more specific for our
framework. It seems that this rephrasing of Gauss' result and his arguments
should be well-known, but we don't know any references on this matter.

Let $E^u_A$ be as before (the unstable eigendirection for $A$). As a line in $\mathbb{R}^2$,
it has the equation $x = \kappa_A y$, with $\kappa_A$ being a quadratic irrationality. According to
Lagrange, its continued fraction expansion is periodic:

$$\kappa_A = [a_0; a_1, a_2, \dots, a_k, a_{k+1}, \dots, a_{k+q}, a_{k+q+1}, \dots, a_{k+2q}, \dots] = [a_0; a_1, a_2, \dots, a_k, (a_{k+1}, \dots, a_{k+q})] \qquad (12)$$

($a_{k+iq+j} = a_{k+j}$ for $i > 0$, $j = 1, \dots, q$). By "the period" of this continued
fraction we shall mean not only $q$, but also the finite sequence of numbers
$(a_{k+1}, \dots, a_{k+q})$ up to a cyclic permutation. The final result about the conjugacy is:

$A$ is conjugate to $B$ via some $C \in \mathrm{GL}(2,\mathbb{Z})$ if and only if the continued
fraction expansions of $\kappa_A$ and $\kappa_B$ have the same period (i.e. the same periodic
part).
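The period of $\kappa_A$ can be computed exactly with integer arithmetic. The sketch below rests on assumptions that are not spelled out in the text: for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ with trace $t$ and $D = t^2 - 4\det A$ it uses $\kappa_A = (a - d \pm \sqrt{D})/(2c)$, the sign being that of $t$ (this follows from $\kappa_A = (\lambda - d)/c$ for the eigenvalue with $|\lambda| > 1$), and the classical recurrence for continued fractions of quadratic irrationalities. The function names and the sample matrices are hypothetical.

```python
from math import isqrt

def kappa_data(A):
    """Integers (P, Q, D) such that kappa_A = (P + sqrt(D)) / Q.

    A is assumed to be a hyperbolic integer matrix with determinant +1 or -1.
    """
    (a, b), (c, d) = A
    t, det = a + d, a * d - b * c
    D = t * t - 4 * det
    if t > 0:                      # lambda = (t + sqrt(D)) / 2
        return a - d, 2 * c, D
    return d - a, -2 * c, D        # lambda = (t - sqrt(D)) / 2

def floor_quadratic(P, Q, D):
    """Integer part of (P + sqrt(D)) / Q for non-square D and Q != 0."""
    s = isqrt(D)
    if Q > 0:
        return (P + s) // Q
    return -((P + s) // (-Q)) - 1

def period(A):
    """Periodic part of the continued fraction of kappa_A, up to rotation."""
    P, Q, D = kappa_data(A)
    seen, digits = {}, []
    while (P, Q) not in seen:      # a repeated complete quotient closes the period
        seen[(P, Q)] = len(digits)
        a = floor_quadratic(P, Q, D)
        digits.append(a)
        P = a * Q - P
        Q = (D - P * P) // Q
    cycle = digits[seen[(P, Q)]:]
    rotations = [tuple(cycle[i:] + cycle[:i]) for i in range(len(cycle))]
    return min(rotations)          # canonical representative of the cyclic class
```

For instance, the two matrices $\begin{pmatrix} 4 & 15 \\ 1 & 4 \end{pmatrix}$ and $\begin{pmatrix} 4 & 5 \\ 3 & 4 \end{pmatrix}$ have the same trace and determinant, but `period` returns different periodic parts for them, so by the criterion above they are not conjugate in $\mathrm{GL}(2,\mathbb{Z})$.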

Here follows a brief sketch of the proof. It is based on the following three
facts.

a) Quadratic irrationalities $\kappa$, $\kappa_1$ have the same period if and only if $\kappa_1$ can
be obtained from $\kappa$ by applying to $\kappa$ some sequence of the following transformations:

$$T_1(\kappa) = \kappa + 1, \qquad T_2(\kappa) = \frac{1}{\kappa}, \qquad T_3(\kappa) = -\kappa$$

and their inverses. This easily follows from the formulas

$$T_1([a_0; a_1, a_2, \dots]) = [a_0 + 1; a_1, a_2, \dots],$$

$$T_2([a_0; a_1, a_2, a_3, \dots]) = \begin{cases} [0; a_0, a_1, a_2, \dots], & \text{if } a_0 > 0, \\ [a_1; a_2, a_3, \dots], & \text{if } a_0 = 0, \\ \text{(some cases for } a_0 < 0), \end{cases}$$

$$T_3([a_0; a_1, a_2, a_3, \dots]) = \begin{cases} [-a_0 - 1; a_2 + 1, a_3, \dots], & \text{if } a_1 = 1, \\ [-a_0 - 1; 1, a_1 - 1, a_2, a_3, \dots], & \text{if } a_1 \ne 1. \end{cases}$$

We do not present all cases for $T_2$ because of their large number. The cases
where $\kappa$ is negative can be obtained from the formula $T_2(\kappa) = T_3(T_2(T_3(\kappa)))$.
Here on the right-hand side $T_2$ is applied to $-\kappa > 0$. Note also that even in
these cases the $a_n$ with large numbers $n$ shift by an odd number of positions ($\pm 1$ or $\pm 3$).
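The rules for $T_2$ and $T_3$ are exact identities already for finite continued fractions, so they can be checked with rational arithmetic. This is a minimal sketch; the rule used for $T_2$ (prepend 0 when $a_0 > 0$, drop the leading 0 when $a_0 = 0$) and the function names are our own way of writing the formulas, and the inputs are assumed to have enough terms for the rule applied.

```python
from fractions import Fraction

def cf_value(terms):
    """Value of the finite continued fraction [a0; a1, ..., an]."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value

def negate(terms):
    """Continued fraction of -[a0; a1, a2, ...] (the rule for T3).

    Needs at least two terms, and at least three when a1 == 1.
    """
    a0, a1, rest = terms[0], terms[1], terms[2:]
    if a1 == 1:
        return [-a0 - 1, rest[0] + 1] + rest[1:]
    return [-a0 - 1, 1, a1 - 1] + rest

def invert(terms):
    """Continued fraction of 1/[a0; a1, ...] for a0 >= 0 (the rule for T2)."""
    if terms[0] == 0:
        return terms[1:]
    return [0] + terms
```

For example, `negate([2, 1, 5, 3])` gives `[-3, 6, 3]`: the tail is shifted by one position, as used in the argument about determinants.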

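b) $\kappa_{C_i A C_i^{-1}} = T_i(\kappa_A)$, where^14

$$C_1 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad C_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad C_3 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}.$$

c) These $C_i$ are generators of $\mathrm{GL}(2,\mathbb{Z})$.

Thus if $\kappa_A$ and $\kappa_B$ have the same period for some $A, B \in \mathrm{GL}(2,\mathbb{Z})$, then due
to statement a) $\kappa_A$ can be obtained from $\kappa_B$ by a sequence of transformations
$T_i^{\pm 1}$. So $A$ is obtained from $B$ by conjugation with a corresponding product of
matrices $C_i^{\pm 1}$ (because of b)).

Conversely, c) implies that if $B = CAC^{-1}$ with some $C \in \mathrm{GL}(2,\mathbb{Z})$, then $B$
can be obtained from $A$ by conjugation by some product of $C_i^{\pm 1}$, and so $\kappa_A$ and
$\kappa_B$ have the same period.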

As regards the conjugation via $C \in \mathrm{SL}(2,\mathbb{Z})$, we shall mention only the
following:

if the period $q$ ("the length of the periodic part") of the continued fraction
expansion for $\kappa_A$ is odd, and $A \sim B$ via some $C \in \mathrm{GL}(2,\mathbb{Z})$, then $A \sim B$ via
some $D \in \mathrm{SL}(2,\mathbb{Z})$;

if the period is even and $A \sim B$ via some $C \in \mathrm{GL}(2,\mathbb{Z}) \setminus \mathrm{SL}(2,\mathbb{Z})$, then there
is no $D \in \mathrm{SL}(2,\mathbb{Z})$ conjugating $A$ and $B$.

Both statements are simple consequences of the following ones: 



^14 Here is a slightly more sophisticated point of view on the relations between $T_i$ and $C_i$.
The standard action of the nondegenerate matrices $C = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$,

$$\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \mapsto \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = Cz,$$

defines also their action on the projective line $\mathbb{RP}^1$ considered as the space of the straight lines
passing through the origin: simply $L \mapsto C(L)$. On

$$\mathbb{RP}^1 \setminus \{\text{the horizontal line } z_2 = 0\}$$

we have the natural coordinate $\kappa = \kappa(L)$ that is the slope of $L$ (so $L$ is described by the
equation $z_1 = \kappa z_2$ mentioned above. One can associate to the horizontal line the symbol $\infty$,
having in mind the usual agreements about the algebraic operations with $\infty$.). Then for a
line $L$

$$\kappa(C(L)) = \frac{\alpha\kappa(L) + \beta}{\gamma\kappa(L) + \delta}.$$

Denote the fractional linear transformation $\kappa \mapsto \frac{\alpha\kappa + \beta}{\gamma\kappa + \delta}$ by $T(C)$ (we can extend it to the
whole $\mathbb{RP}^1$ taking $T(C)\infty = \alpha/\gamma$, but we do not need this). Then $T(C_i) = T_i$, $i = 1, 2, 3$. It
remains to add that $C(E^u_A) = E^u_{CAC^{-1}}$.

(a) if $q$ is odd, there exists a matrix $C \in \mathrm{GL}(2,\mathbb{Z})$ such that $\det C = -1$ and
$A = CAC^{-1}$;

(b) if $q$ is even and $A = CAC^{-1}$ with some $C \in \mathrm{GL}(2,\mathbb{Z})$, then $\det C = 1$.

Indeed, when we apply the operations $T_2$ or $T_3$ to $\kappa_A$, this leads to a shift by
one position left or right of all coefficients of the continued fraction expansion
for $\kappa_A$ with sufficiently large number: the $n$-th coefficient $a_n$ goes to the $(n+1)$-st
or $(n-1)$-st place. When we apply $T_1$, $a_n$ remains at the $n$-th place. Here we
speak about the "fate" of an individual coefficient under the action of $T_i$ on $\kappa_A$.
This needs some care, but can be justified for $a_n$ with large $n$. On the other
hand, $\det C_1 = 1$, $\det C_2 = \det C_3 = -1$, so for any $C \in \mathrm{GL}(2,\mathbb{Z})$

$$\det C = 1 \iff C \text{ shifts the "tail" of the continued fraction for } \kappa_A \text{ by an even number of positions.}$$

So, if the period is even, then any transformation that maps $\kappa_A$ to $\kappa_A$
should shift its "tail" by $qt$ ($t \in \mathbb{Z}$) positions, which is an even number. Therefore,
the determinant of a corresponding matrix should be equal to 1.

On the other hand, if this period is odd then it is not difficult to make sure
that there exists a sequence of transformations that shifts the "tail" exactly by $q$
positions (so the determinant of the matrix should be $-1$).

For example, if $A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$, then $\kappa_A = \frac{1+\sqrt{5}}{2} = [(1)]$ and so
$\kappa_A = T_2 T_1^{-1}(\kappa_A)$.
Consequently, $A = (C_2 C_1^{-1}) A (C_2 C_1^{-1})^{-1}$ (which can be checked directly),
where $\det(C_2 C_1^{-1}) = -1$.
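The direct check mentioned in this example is a short computation. The sketch below assumes $A = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$ (the matrix of "Arnold's cat" from Part 2), $C_1 = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $C_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$; these concrete entries are our reading of the formulas and should be treated as assumptions.

```python
def mat_mul(M, N):
    """Product of two 2x2 integer matrices given as nested tuples."""
    return tuple(
        tuple(sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def det(M):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = M
    return a * d - b * c

def inverse(M):
    """Inverse of an integer matrix whose determinant is +1 or -1."""
    (a, b), (c, d) = M
    e = det(M)                              # assumed to be +1 or -1
    return ((d * e, -b * e), (-c * e, a * e))

A = ((2, 1), (1, 1))     # assumed matrix of the example
C1 = ((1, 1), (0, 1))    # assumed matrix realizing T1
C2 = ((0, 1), (1, 0))    # assumed matrix realizing T2

C = mat_mul(C2, inverse(C1))                      # C2 * C1^{-1}
conjugated = mat_mul(mat_mul(C, A), inverse(C))   # C A C^{-1}
```

Here `C` has determinant $-1$ and `conjugated` coincides with `A`, in agreement with statement (a) for the odd period $q = 1$.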



3 Markov partitions for hyperbolic 
automorphism of 2-torus 

First we shall define Markov parallelograms. 

a) A Markov parallelogram in the plane (for a hyperbolic $A \in \mathrm{GL}(2,\mathbb{Z})$) is
a parallelogram $\Pi$ in $\mathbb{R}^2$ having two sides parallel to $E^u_A$ (let us call these sides
"unstable", or "expanding", and denote their union by $\partial^u\Pi$) and two other sides
parallel to $E^s_A$ (let us call these sides "stable", or "contracting", and denote their
union by $\partial^s\Pi$).

b) A Markov parallelogram in the torus (for a hyperbolic automorphism $\mathcal{A}$)
is a projection $P = p\Pi$ of some Markov parallelogram $\Pi \subset \mathbb{R}^2$ (for the related
$A$) provided that the interior $\operatorname{int}\Pi$ projects injectively.^15 By the "interior" of $P$ one
often understands the image $P^\circ = p(\operatorname{int}\Pi)$ of the interior $\operatorname{int}\Pi$.^16 Projections of
the unstable (stable) sides of $\Pi$ are called the unstable (stable) sides of $P$, their
union is denoted by $\partial^u P$ ($\partial^s P$); so $P \setminus P^\circ = \partial^u P \cup \partial^s P$. Unstable (stable)
sides of $P$ are arcs of the leaves of the one-dimensional foliations $\mathcal{W}^u_A$ ($\mathcal{W}^s_A$)
introduced in the beginning of Part 2.

^15 Two opposite sides of $\Pi$ may project onto two partially overlapping arcs.
^16 Because of what is said in the previous footnote, $P^\circ$ may be slightly smaller than the true
interior $\operatorname{int} P$ on the torus.

A Markov partition $\mathcal{P} = \{P_1, \dots, P_k\}$ (for $\mathcal{A}$) is a partition of $\mathbb{T}^2$ consisting
of a finite number of Markov parallelograms $P_i$ provided this system of parallelograms
satisfies two conditions concerning its behavior with regard to $\mathcal{A}$. These
conditions are formulated below. But first we must give a warning. Strictly
speaking, "partition" here is not a partition in the literal sense, i.e. a decomposition
of $\mathbb{T}^2$ into a system of non-intersecting sets. In our case this means that
sides of two parallelograms can have common points. Two unstable sides (or
two stable sides) of two different parallelograms can partially overlap; they also
can have a single common point. A stable side of one parallelogram and an
unstable side of another also can have a finite number of common points. Here
is a briefer formulation of the requirement on the $P_i$: the $P_i^\circ$ do not intersect each
other and $\mathbb{T}^2 \setminus (P_1^\circ \cup \dots \cup P_k^\circ)$ is a finite union of arcs lying on leaves of $\mathcal{W}^{u,s}_A$.
Points of this set can be considered as exceptional ones. The set of exceptional
points is negligible in many respects (e.g. from the measure-theoretical point of
view) and at the same time this set admits a more or less concise description
and thus can be taken into account if necessary.

Now we shall formulate two conditions on the behavior of $\mathcal{P}$ with respect to
$\mathcal{A}$.

I. Each contracting side of any $\mathcal{A}P_i$ lies on a contracting side of some $P_j$.
Each expanding side of any $P_i$ lies on an expanding side of some $\mathcal{A}P_j$ (i.e. on
the image of an expanding side of $P_j$).

The same can be expressed in terms of the system of Markov parallelograms
$\Pi_i$ in $\mathbb{R}^2$ mentioned in the definition of Markov parallelograms $P_i$ in $\mathbb{T}^2$. This
version of condition I is almost literally the same as the version formulated in
terms of $P_i$; one needs only to have in mind that in order to get a partition of
$\mathbb{R}^2$, one must take $\Pi_i + (m, n)$ with all $m, n \in \mathbb{Z}$ and $i = 1, \dots, k$.

Another condition can be more pictorially formulated in terms of $\mathbb{R}^2$.

II. For all $i, j = 1, \dots, k$ only one of the intersections $A\Pi_i \cap (\Pi_j + (m, n))$
with all $m, n \in \mathbb{Z}$ can have nonempty interior.

In terms of $\mathbb{T}^2$ this condition claims:

Any nonempty $\mathcal{A}P_i^\circ \cap P_j^\circ$ consists of only one connectivity component.

Refinements of this notion.^17

A) Markov partitions in the strict sense (strMp): the Markov partitions
in the sense defined above.

B) Quasi-Markov partitions (qMp). Assume we are given two different directions
in $\mathbb{R}^2$ such that the straight lines going in these directions have irrational
angular coefficients. (They are not assumed to have any relation to any $A$;
now we do not have any $A$ at all.) Denote by $E_1$, $E_2$ the straight lines going
through $(0, 0)$ in these directions. Let $W_{1,2} = p(E_{1,2})$ and let $\mathcal{W}_{1,2}$ be the
one-dimensional foliations consisting of all group shifts of $W_{1,2}$ (i.e. obtained
by projecting to $\mathbb{T}^2$ all lines parallel to $E_{1,2}$). Replacing $E^{u,s}_A$, $W^{u,s}_A$, $\mathcal{W}^{u,s}_A$ in
the part of the definition of the Markov parallelograms and Markov partitions

^17 They concern only our case (hyperbolic automorphisms of the 2-torus), not the Markov
partitions for more general or related objects mentioned in one of the footnotes in Part 2.

preceding I, II by $E_{1,2}$, $W_{1,2}$, $\mathcal{W}_{1,2}$, we get a definition of a qMp (for the two
directions given).

Let us prove that there exists no qMp consisting of merely one element, i.e.
of one Markov parallelogram. (Later we shall see that there are qMp consisting
of two elements. Such qMp's can be considered as the simplest ones.)

Look at any point $A$ that is a corner of this parallelogram $P$. In a small
neighborhood of $A$ the boundary of $P$ is a union of two segments, one parallel to
$E_1$, the other parallel to $E_2$. Thus there are three possibilities: both segments
have their ends at $A$ (like in the letter L); one passes through $A$, the other ends there
(like in T); both pass through $A$ (like in X).

In the first case our parallelogram should have an angle larger than 180°.
Indeed, lift $A$ to some point $\tilde A$ on the plane, choose a point close to $\tilde A$ that lies in
the more-than-180° angle and then consider the lift $\Pi$ of the parallelogram that
contains this point. Then $\Pi$ is obviously not convex.

In the second case without loss of generality we can suppose that the segment
parallel to $E_2$ passes through $A$ and the segment parallel to $E_1$ starts at $A$ and goes in
the direction we call positive. Also we arbitrarily fix a positive direction on $E_2$. Any
lift $\Pi$ of the parallelogram has four corners. Note that each corner is uniquely
defined by the directions of its sides (there are two possibilities for the direction of the edge
parallel to $E_1$ that starts at the corner and two possibilities for the one parallel to
$E_2$). So we see that two corners of $\Pi$, namely (positive $E_1$, positive $E_2$) and
(positive $E_1$, negative $E_2$), project into the point $A$. Thus the difference between their
coordinates on the plane is $(i, j) \in \mathbb{Z}^2$. But they share the same edge of $\Pi$,
which has direction $E_2$. So this direction has a rational slope $i/j$, which is not
true.

In the third case this argument also works, since all corners of $\Pi$ map
to the same point $A$, hence both directions $E_{1,2}$ are rational.

C) Pre-Markov partition (preMp). Like strMp, it is also related to some 
hyperbolic automorphism A, but in its definition the condition II is omitted. 

Let $\mathcal{P} = \{P_1, \dots, P_k\}$ be a strMp for $\mathcal{A}$. Then $\mathcal{P}$ defines the following coding
of points of $\mathbb{T}^2$ and their trajectories.

A point $x \in \mathbb{T}^2$ is coded by a bilaterally infinite sequence $\{i_n;\ n \in \mathbb{Z}\}$
such that $\mathcal{A}^n(x) \in P_{i_n}$ for all $n$. Strictly speaking, this coding is univalent for
the points of the set $\bigcap_{n=-\infty}^{\infty} \mathcal{A}^n(P_1^\circ \cup \dots \cup P_k^\circ)$, which is of "full measure"
(its complement has Lebesgue measure 0). Exceptional points need some
special care, like the points with binary rational cyclic coordinates in Part 1, and
even more care: now the "good" definition of the coding for them involves
some precautions which were absent there (see below). But still they do not
do much harm.

We shall describe the precautions mentioned above right now, and later we
shall explain why they are taken. The previous attempt to define the bilateral
sequence $(a_n)$ corresponding to a point $x \in \mathbb{T}^2$ is equivalent to the following
recipe:

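$$x \mapsto (a_n) \text{ if and only if } \mathcal{A}^n(x) \in P_{a_n} \text{ for all } n \in \mathbb{Z}.$$

In other words,

$$x \mapsto (a_n) \text{ if and only if } x \in \bigcap_{n=-N}^{N} \mathcal{A}^{-n}(P_{a_n}) \text{ for all } N \in \mathbb{Z}_+ \qquad (13)$$

(compare to (5), (6)). The correct definition is

$$x \mapsto (a_n) \text{ if and only if } x \in \operatorname{clos}\Big(\bigcap_{n=-N}^{N} \mathcal{A}^{-n}(P^\circ_{a_n})\Big) \text{ for all } N \in \mathbb{Z}_+, \qquad (14)$$

where clos denotes the closure. For "unexceptional" points $x \in \bigcap_{n=-\infty}^{\infty} \mathcal{A}^n(P_1^\circ \cup
\dots \cup P_k^\circ)$ this definition coincides with the previous one, but if $\mathcal{A}^n x \in \partial P_i$ for
some $n$, $i$, then for such $x$ the new definition is more restrictive.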

It is important that different points have different codings. Thus everything that
happens in the dynamical system $(\mathbb{T}^2, \mathcal{A})$ is somehow reflected in the coding.

Codes of all points constitute some subset of $\{1, \dots, k\}^{\mathbb{Z}}$. It turns out that
it is a so-called Markov subset. Markov subsets themselves are defined independently
of the toric automorphisms. Here follows their definition.

Any Markov subset corresponds to some subset $\Lambda \subset \{1, \dots, k\}^2$. Pairs
$(i, j) \in \Lambda$ are called "admissible", other pairs "forbidden". Given $\Lambda$, we
define the related Markov set $M \subset \{1, \dots, k\}^{\mathbb{Z}}$ as the set of all doubly (bilaterally)
infinite sequences $(i_n)$ such that $(i_n, i_{n+1}) \in \Lambda$ for all $n$. $M$ is easily seen to
be a closed subset of $\{1, \dots, k\}^{\mathbb{Z}}$ (the latter endowed with a topology similar to the
topology used in Part 1) invariant with respect to the (bilateral) topological
Bernoulli shift $\sigma$ (also defined analogously). The pair $(M, \sigma_M)$, where $\sigma_M$ is the
restriction $\sigma_M = \sigma|_M$, is called the topological Markov shift. Probability
theory and ergodic theory supply extensive information about $(M, \sigma_M)$.
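The definition of a Markov subset is easy to experiment with on finite windows of sequences. The following is a minimal sketch; the set of admissible pairs `allowed` is a hypothetical example with $k = 2$ (the pair $(2, 2)$ is forbidden), not one arising from a partition considered in the text.

```python
from itertools import product

def admissible_words(k, allowed, n):
    """All words of length n over {1, ..., k} whose consecutive pairs are admissible."""
    return [
        w for w in product(range(1, k + 1), repeat=n)
        if all((w[i], w[i + 1]) in allowed for i in range(n - 1))
    ]

def shift(word):
    """Left shift restricted to a finite window: the first symbol is dropped."""
    return word[1:]

allowed = {(1, 1), (1, 2), (2, 1)}     # hypothetical set of admissible pairs
counts = [len(admissible_words(2, allowed, n)) for n in range(1, 6)]
```

Shifting an admissible word again gives an admissible word, which is the finite counterpart of the invariance of $M$ under the shift; for this particular choice of pairs the numbers of admissible words of length $n$ are consecutive Fibonacci numbers.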

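For a Markov subset $M$ "coding" points of $\mathbb{T}^2$, a pair $(i, j)$ is admissible when
$\mathcal{A}(P_i^\circ) \cap P_j^\circ \ne \emptyset$, i.e. $\operatorname{int}(\mathcal{A}(P_i) \cap P_j) \ne \emptyset$. The main step of proving that $M$
actually is the Markov subset corresponding to this set of admissible pairs is
the following:

$$\text{if } \mathcal{A}(P_i^\circ) \cap P_j^\circ \ne \emptyset,\ \mathcal{A}(P_j^\circ) \cap P_h^\circ \ne \emptyset, \text{ then } \mathcal{A}^2 P_i^\circ \cap \mathcal{A} P_j^\circ \cap P_h^\circ \ne \emptyset.$$

If we had called "admissible" all those pairs $(i, j)$ for which $\mathcal{A}P_i \cap P_j \ne \emptyset$
(which would correspond to (13)), then we would have to know that

$$\text{if } \mathcal{A}(P_i) \cap P_j \ne \emptyset,\ \mathcal{A}(P_j) \cap P_h \ne \emptyset, \text{ then } \mathcal{A}^2 P_i \cap \mathcal{A} P_j \cap P_h \ne \emptyset. \qquad (15)$$

But generally the last statement is wrong. This explains why one has to define
the coding for "exceptional" points according to (14).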

Here is an example demonstrating that generally (15) is wrong (see Fig. 2). For convenience
we assume $\lambda$ and $\mu$ to be positive. Denote $K = (-\tfrac12, \tfrac12)^2$. Clearly $(K + (m, n)) \cap K = \emptyset$
if $(m, n) \in \mathbb{Z}^2 \setminus \{(0, 0)\}$. (The closure of $K$ is a fundamental domain.) The straight line $E^u_A$ cuts
$\operatorname{clos} K$ into two trapeziums $K'$ and $K''$ (we consider them as being closed sets). Let Markov
parallelograms $\Pi_1$, $\Pi_2$ be such that

$$\Pi_1 \subset K', \qquad \Pi_2 \subset K'', \qquad \partial^u\Pi_1 \cap \partial^u\Pi_2 \ni O \text{ (the origin)}$$

(so that $\partial^u\Pi_i$ for both $i$ contains a small arc of $E^u_A$ passing through $O$), and let $\Pi_1$ be so small
that $A^2(\Pi_1) \cup A(\Pi_1) \subset K'$. Finally, let $A\Pi_2$ intersect a third Markov parallelogram $\Pi_3$ lying
completely in $\operatorname{int} K''$. Then

$$\mathcal{A}(P_1) \ni \mathcal{A}0 = 0 \text{ (the zero of the group } \mathbb{T}^2), \qquad \mathcal{A}(P_1) \cap P_2 \ne \emptyset, \qquad \mathcal{A}(P_2) \cap P_3 \ne \emptyset,$$

but $\mathcal{A}^2(P_1) \cap \mathcal{A}(P_2) \cap P_3 = \emptyset$ and even $\mathcal{A}^2(P_1) \cap P_3 = \emptyset$, because the only "congruent (with
respect to shifts by the elements of $\mathbb{Z}^2$) copy" of $\Pi_3$ lying in $K$ is $\Pi_3$, which lies in $\operatorname{int} K''$,
while $A^2(\Pi_1) \cap K'' = \emptyset$.

Figure 2: To the example showing (15) to be wrong.

In this argument we took for granted that there exist Markov partitions with sufficiently
small $P_i$. One can get such a partition beginning with some Markov partition and passing
successively several times from one Markov partition to another by means of the following two
operations:

(i) passing from a Markov partition $\{P_1, \dots, P_k\}$ to the Markov partition consisting of
the intersections $\mathcal{A}P_i \cap P_j$ with nonempty interiors;

(ii) passing from a Markov partition $\{P_1, \dots, P_k\}$ to the Markov partition consisting of
the intersections $\mathcal{A}^{-1}P_i \cap P_j$ with nonempty interiors.

Originally we were interested in the dynamical system $\{\mathcal{A}^n\}$ on $\mathbb{T}^2$. It
turns out that the dynamical system $\{\sigma_M^n\}$ on $M$ provides a symbolic model
for the previous system which is of the same character as the symbolic model
for $(\mathbb{S}^1, f)$ in Part 1. There exists a continuous map $\pi\colon M \to \mathbb{T}^2$ such that
$\pi(\text{the code of } x) = x$ and $\pi \circ \sigma_M = \mathcal{A} \circ \pi$. The preimage of the Lebesgue measure on
$\mathbb{T}^2$ is a measure $\mu$ on $M$ invariant with respect to $\sigma_M$. $(M, \sigma_M, \mu)$ is a Markov
process in the usual sense of probability theory, $x = (x_n) \in M$ describing
the elementary event with the current state $x_0$.

A highly nontrivial "purely measure theoretical" theory of D. Ornstein leads
to the conclusion that two Markov processes satisfying some additional conditions,
which are fulfilled in our case, are isomorphic in the measure theoretical
sense if (and only if; this was known before) they have the same entropy. Passing
back to the toric automorphisms, we can conclude that $(\mathbb{T}^2, \mathcal{A})$ and $(\mathbb{T}^2, \mathcal{B})$
are isomorphic in the measure-theoretical sense^18 if and only if they have the

^18 I.e. there exists a map $\varphi\colon \mathbb{T}^2 \to \mathbb{T}^2$ which is an automorphism of the measure space
$(\mathbb{T}^2, \operatorname{mes})$ and such that $\mathcal{B} = \varphi \circ \mathcal{A} \circ \varphi^{-1}$.

same eigenvalues. (This is because the entropy in this case is equal to $\log_2 |\lambda|$ where
$\lambda$ is an eigenvalue such that $|\lambda| > 1$.) Compare this with the more complicated
situation concerning the topological conjugacy of $\mathcal{A}$, $\mathcal{B}$ described in the previous
part.
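The entropy formula is easy to evaluate. This is a minimal sketch, assuming a hyperbolic integer matrix with determinant $\pm 1$ and using $|\lambda| = (|t| + \sqrt{t^2 - 4\det A})/2$ for the trace $t$; the function name and the sample matrices are illustrative choices.

```python
from math import log2, sqrt

def entropy(A):
    """log2 |lambda| for the eigenvalue of modulus > 1 of a hyperbolic 2x2 matrix."""
    (a, b), (c, d) = A
    t, det = a + d, a * d - b * c
    lam = (abs(t) + sqrt(t * t - 4 * det)) / 2   # modulus of the larger eigenvalue
    return log2(lam)

h_cat = entropy(((2, 1), (1, 1)))
h_first = entropy(((4, 15), (1, 4)))
h_second = entropy(((4, 5), (3, 4)))
```

The last two matrices have the same trace and determinant, hence the same entropy, although the continued fractions of their unstable slopes ($\sqrt{15}$ and $\sqrt{15}/3$) have different periodic parts; this illustrates how much coarser the measure-theoretical classification is.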

Another example of the use of coding. Besides $\mu$, probability theory provides
many other measures $\nu$ which are invariant with respect to $\sigma_M$ and such
that $(M, \sigma_M, \nu)$ is also a Markov process. They can be projected to $\mathbb{T}^2$ and
this supplies us with new invariant measures for $\mathcal{A}$. (While the invariance of
the Lebesgue measure with respect to $\mathcal{A}$ is clear, the existence of other invariant
measures is by no means trivial.)

Unfortunately, ergodic theory leads to the conclusion that usually a
strMp has to consist of rather many elements $P_i$: their number $k$ cannot be
less than $|\lambda|$; otherwise the diversity of motions (trajectories) in $(\mathbb{T}^2, \mathcal{A})$ cannot
be reproduced in $(M, \sigma_M)$. On the other hand, any $\mathcal{A}$ has a preMp consisting
of two elements only. If we use this preMp for "coding" in the same way
as was done for a strMp, it will turn out that some pairs of different points $x$, $y$ have
the same coding and the set of such $(x, y)$ is by no means "small". But there is
a modification of the coding process which is a remedy for this defect.

Given a preMp $\mathcal{P}$, we define

$\mathcal{P}' = \{$closures of nonempty connected components of $A P_i^\circ \cap P_j^\circ\}$.

(Relations between elements of $\mathcal{P}$ and $\mathcal{P}'$ are better
seen on $\mathbb{R}^2$.) $\mathcal{P}'$ turns out to be a strMp. Thus it
defines a "good" coding. This coding can also be seen and described in terms
of $\mathcal{P}$ alone as follows. Associated with a preMp $\mathcal{P}$ there
is an oriented multigraph $\Gamma$:

• vertices of $\Gamma$ are parallelograms $P_i$;

• there is an oriented edge $e$ from $P_i$ to $P_j$ if and only if
$A\Pi_i \cap (\Pi_j + (m, n))$ has nonempty interior;

• if $\operatorname{int}(A\Pi_i \cap (\Pi_j + (m, n))) \ne \emptyset$ for
several $(m, n)$, then corresponding to them there are several edges going
from $P_i$ to $P_j$ (so each edge corresponds to some
$P'_l \in \mathcal{P}'$).

In terms of $\mathcal{P}'$, the pair $(p, q)$ is admissible if and only if
$\operatorname{int}(A\Pi'_p \cap (\Pi'_q + (m, n))) \ne \emptyset$ for some
$m, n \in \mathbb{Z}$. In terms of $\Gamma$ this looks quite geometric: the
end of $e_p$ (i.e., of the edge corresponding to $P'_p$) is the beginning of
$e_q$. An infinite path in $\Gamma$ is just a sequence of edges $\{e_{h_n}\}$
such that all pairs $(e_{h_n}, e_{h_{n+1}})$ are admissible, i.e. after coming
to a vertex along $e_{h_n}$, we continue our path along the edge
$e_{h_{n+1}}$.
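The edge coding just described can be sketched in a few lines. The multigraph below is purely hypothetical (it is not derived from any particular automorphism); edges are stored as (source vertex, target vertex) pairs, and a pair of edges is admissible when the end of the first is the beginning of the second.

```python
# Hypothetical oriented multigraph on vertices 0 and 1 (an assumption made
# for illustration): note the two parallel edges from vertex 0 to vertex 1.
EDGES = [(0, 0), (0, 1), (0, 1), (1, 0)]

def admissible(p, q, edges=EDGES):
    """Edge q may follow edge p iff p ends where q begins."""
    return edges[p][1] == edges[q][0]

def is_path(seq, edges=EDGES):
    """Check that a finite sequence of edge indices is a path in the graph."""
    return all(admissible(p, q, edges) for p, q in zip(seq, seq[1:]))

def count_paths(n, edges=EDGES):
    """Number of admissible edge sequences of length n >= 1."""
    counts = [1] * len(edges)          # paths of length 1 starting with each edge
    for _ in range(n - 1):
        counts = [
            sum(counts[q] for q in range(len(edges)) if admissible(p, q, edges))
            for p in range(len(edges))
        ]
    return sum(counts)
```

Coding by edges rather than by vertices is what distinguishes the two parallel edges, which is the point of passing from $\mathcal{P}$ to $\mathcal{P}'$.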

There exists a simple construction of the simplest preMp's, i.e. those
consisting of 2 elements. Basically it is the construction of a qMp consisting
of 2 elements for two directions $E^{1,2}$ with irrational angular
coefficients.

It begins with choosing some system of data. First, it includes choosing an
"initial point" $P \in T^2$ (let $P = p(Q)$) and choosing one of the two lines
$E^1 + Q$, $E^2 + Q$ which are parallel to $E^1$, $E^2$ and pass through $Q$.
Let, for definiteness, $E^1 + Q$ be chosen (in the case when we choose
$E^2 + Q$, everything goes analogously; so to speak, $E^1$ and $E^2$ exchange
their roles). Choose one of the two rays of $E^2 + Q$ beginning at $Q$ and
denote it by $L$.

Essential for the construction is an arc $I$ of $p(E^1 + Q) = W^1 + P$ which
passes through $P = p(Q)$ and has endpoints $A$, $B$ such that

• $A$ is the first (after $P$) intersection of $p(L)$ with $I$,

• $B$ is the second intersection of $p(L)$ with $I$.

Let us parameterize $L$ by a parameter $t$ so that (for definiteness) the
value of $t$ corresponding to a point $z \in L$ equals the length of the
straight segment $Qz$; such $z$ we denote by $z(t)$. Then our crucial
condition on $A$ and $B$ is:

$A = p(z(t_A))$, $B = p(z(t_B))$, where $t_{A,B}$ are such that $0 < t_A < t_B$ and
$p(z(t)) \notin I$ for $0 < t < t_B$, $t \ne t_A$.

Let us call this system of data ($P$, $L$ and $I$) the T-configuration (we
think of $I$ as the crossbar of the letter T and of $L$ as its vertical line,
or leg).

One needs some argument in order to prove that the conditions on the
intersections of $p(L)$ with $I$ can be satisfied by a proper choice of $I$.
Begin with an arbitrary arc $J$ of $p(E^1 + Q) = W^1 + P$ containing $P$ in
its interior. Consider the subsequent intersections of $p(L)$ with $J$. Let
them correspond to the values $t_i$ of the parameter $t$, where
$0 < t_1 < t_2 < \cdots$. Note that the points $p(z(t_i))$ are dense on $J$.
Take

$i = \min\{j : p(z(t_j))$ and $p(z(t_{j+1}))$ lie on $J$ on opposite sides of $P\}$.

For $C, D \in J$ denote by $d(C, D)$ the length of the arc of $J$ between the
points $C$ and $D$. Let $\min_{0 < j \le i} d(p(z(t_j)), P)$ be achieved at
$j = h$. Then we can take

$t_A = t_h$, $t_B = t_{i+1}$, $A = p(z(t_A))$, $B = p(z(t_B))$, $I$ = the arc of $J$ between $A$ and $B$.

A T-configuration defines some qMp in a natural way. Namely, let $C$ be the
next point (after $B$) of the intersection of $p(L)$ and $I$ (it is an
interior point of $I$). It turns out that the arc $PC$ of
$p(E^2 + Q) = W^2 + P$ and the arc $I$ of $p(E^1 + Q) = W^1 + P$ divide $T^2$
into two Markov parallelograms (for the directions of $E^{1,2}$).

To prove this we use the following idea. Move $I_j$ in the direction $e_2$
($e_2$ is a unit vector in $E^2$ that has the same direction as $L$):
$I_j(t) = I_j + te_2$. For small $t > 0$ the set $I_j(t) \cap \pi^{-1}(D)$
contains only the endpoints of $I_j(t)$. We proceed while this holds, and at
some moment we have a "catastrophe". It is clear that a "catastrophe" (i.e. a
change of the set $(I_j(t) \cap \pi^{-1}(D)) - te_2$) can occur only at the
moments with $I_j(t) \cap \pi^{-1}(I) \ne \emptyset$. Such moments are
discrete (each component of $\pi^{-1}(I)$ produces at most one such moment,
and only a compact part, which contains a finite number of components, can
contribute on a finite interval of time). Therefore there is a first moment
$t^*$ when $(I_j(t) \cap \pi^{-1}(D)) - te_2$ could change, with two cases:
$I_j(t) \cap \pi^{-1}(D)$ is either one point or a segment. In the first case
there is no "catastrophe": if for $t = t^* + \varepsilon$ one endpoint of
$I_j(t)$ doesn't belong to $D$, then at $t = t^*$ it coincides with $C$, and
if there is a new point in $I_j(t) \cap \pi^{-1}(D)$ for
$t = t^* + \varepsilon$, then $P$ lies in $I(t^*)$.

In the second case we have again two possibilities: either
$I_j(t^*) \subset \pi^{-1}(I)$ or $\operatorname{int} I_j(t^*)$ contains an
endpoint $z$ of $\pi^{-1}(I)$. But in the latter case
$z - \varepsilon e_2 \in D$, so
$\operatorname{int} I_j(t^* - \varepsilon) \cap \pi^{-1}(D) \ne \emptyset$.
Thus the former case takes place, and $M_j = \bigcup_{0 < t < t^*} I_j(t)$ is
a connectivity component of $T^2 \setminus D$.

It remains to prove that these $M_{1,2}$ are the only connectivity components.
Consider any $z \in T^2 \setminus D$ and move it in the direction $(-e_2)$
until the first intersection with $D$ at some moment $t$. Then
$z - te_2 \in \operatorname{int} I_j$ for some $j = 1, 2$ and therefore
$z' = z - (t - \varepsilon)e_2 \in M_j$. So we have a path
$\{z - \tau e_2\}_{\tau \in [0, t - \varepsilon]}$ in $T^2 \setminus D$ that
connects $z$ with a point in $M_j$. Thus $z \in M_j$.

Conversely, any two-element qMp (for the directions of $E^{1,2}$) can be
obtained in this way by means of a suitable T-configuration. The proof uses
the same technique as the proof of the non-existence of a qMp consisting of
one parallelogram.

So, we choose directions on $E^{1,2}$ in an arbitrary way; $E^{\pm,1,2}$ are
their rays of the corresponding direction starting at $(0, 0)$. Also we define
$W^{\pm,1,2}(P) = p(E^{\pm,1,2} + Q)$ if $P = p(Q)$.

Then we consider any point $P$ where two segments of the boundary of the
parallelograms intersect. As before, we have three possibilities: both
segments have their ends here (L); both segments pass through $P$ (X); one
passes through $P$ and one ends at $P$ (T). Clearly, the L-case can't take
place, as one of the figures separated by these lines has an angle of more
than 180°.

In the X-case we prolong all four lines until they stop belonging to the
boundary and obtain four points $P^{\pm,j}$. Note that each $P^{\pm,j}$
belongs both to the boundary segment that ends there (the one coming from $P$)
and to the boundary segment of the other direction that passes through this
point. So, near all four points $P^{\pm,j}$ the boundary has T-type, with the
directions of the "leg" of this T being different. Thus, these five points are
different. Count the corners of the parallelograms: two near each
$P^{\pm,j}$, four near $P$ (and some may also be at other points), in total at
least 12, not 8. So this case also can't take place.

In the T-case we can assume without loss of generality that the "leg" of T
belongs to $W^{+,2}(P)$. Similarly, we obtain four different corners on the
boundary: $P$, $P^{+,2}$, $P^{-,1}$, $P^{+,1}$, and because at these points we
already have 8 corners, there are no other corners on the boundary. Each
segment of the boundary has two ends, and these ends are T-points, which are
different for different segments. So the boundary consists of two segments:
$I = P^{-,1}P^{+,1}$ in the $E^1$-direction and $PP^{+,2}$ in the
$E^2$-direction. So the points $P^{\pm,1}$ lie on $PP^{+,2}$. It is clear that
$P$, $L = W^{+,2}(P)$ and $I$ comprise a T-configuration that produces the
given qMp.

If we are given a hyperbolic automorphism $A$ of $T^2$, then this construction
with $E^1 = E^s$, $E^2 = E^u$ or $E^1 = E^u$, $E^2 = E^s$ gives a preMp for
$A$, provided that $P$ is a fixpoint for $A$.

4 Classification of the simplest preMp 

Besides the conjugating of toric automorphisms by means of toric
automorphisms, we shall consider their conjugating by means of affine
diffeomorphisms of $T^2$, i.e. by means of maps

$z \mapsto C(z) = Bz + g$,

where $B$ are toric automorphisms and $g \in T^2$. In other words, $C$ is
obtained by projecting to $T^2$ an affine map of the plane, i.e. a map

$z \mapsto C(z) = Bz + b$ with $B \in GL(2, \mathbb{Z})$ and $b \in p^{-1}(g)$.

We shall need only the case when the result of the conjugating of a toric
automorphism $A$ by means of $C$ is a toric automorphism again (actually we
shall demand even more). It is easy to see that this is the case if and only
if $B^{-1}b$ is a fixpoint of $A$.
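The last claim can be checked directly: $CAC^{-1}(z) = BAB^{-1}z + (b - BAB^{-1}b)$, and the constant term is an integer vector exactly when $(A - \mathrm{id})B^{-1}b$ is. The sketch below verifies this equivalence in exact rational arithmetic; the helper names and the sample matrices $A$, $B$ and vectors $b$ are our own assumptions, chosen for illustration.

```python
from fractions import Fraction as F

def mat_vec(M, v):
    return (M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1])

def mat_mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv(B):
    """Inverse of an integer matrix with determinant +1 or -1."""
    det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
    return ((B[1][1] // det, -B[0][1] // det), (-B[1][0] // det, B[0][0] // det))

def is_integer_vector(v):
    return all(F(c).denominator == 1 for c in v)

def translation_part(A, B, b):
    """Constant term of z -> C A C^{-1} z, where C z = B z + b."""
    Mb = mat_vec(mat_mul(mat_mul(B, A), inv(B)), b)
    return (b[0] - Mb[0], b[1] - Mb[1])

def projects_to_fixpoint(A, x):
    """True iff A x - x is an integer vector, i.e. p(x) is a fixpoint of A."""
    Ax = mat_vec(A, x)
    return is_integer_vector((Ax[0] - x[0], Ax[1] - x[1]))

# Sample data (assumptions): A is hyperbolic with 5 fixpoints on the torus.
A = ((5, 3), (3, 2))
B = ((1, 1), (0, 1))
b_good = (F(7, 5), F(3, 5))   # B^{-1} b_good = (4/5, 3/5), a fixpoint of A
b_bad = (F(1, 2), F(0))       # B^{-1} b_bad = (1/2, 0), not a fixpoint of A
```

For `b_good` the conjugated map has an integer constant term and so is a toric automorphism; for `b_bad` it is not.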

If $C$ acts on the objects $O$ from some class of objects $\{O\}$, then it is
natural to say that the pair

($A$, an object $O$ somehow related to $A$)

is equivalent to $(CAC^{-1}, C(O))$ (provided it is true that $C(O)$ is
related to $CAC^{-1}$ in the same way as $O$ is related to $A$).

If $\mathcal{P} = \{P_i\}$ is a preMp for $A$, then
$C\mathcal{P} = \{CP_i\}$ is a preMp for $CAC^{-1}$: if the sides of $\Pi_i$
are parallel to $E^{s,u}_A$, then the sides of $CAC^{-1}(C\Pi_i)$ are parallel
to

$E^s_{CAC^{-1}} = BE^s_A$, $E^u_{CAC^{-1}} = BE^u_A$;

if $AP_i \cap P_j$ are "good", then

$CAC^{-1}(CP_i) \cap CP_j = C(AP_i \cap P_j)$

are also "good".

From this point on we impose an additional condition on preMp's. Since the
contracting segment of the boundary is mapped into itself, there is a fixed
point on it (as the segment is compact). For the same reason, applied to the
inverse transformation, the expanding segment also has a fixed point. In our
examples these two fixpoints coincide and are placed at one of the four joint
points ("vertices") of the contracting and expanding segments, i.e. the
following condition holds:

III. There is a fixpoint that belongs to the intersection of the stable and
unstable segments.

We call such preMp's "of vertex type". There are also preMp's without this
condition, with different fixpoints on the expanding and contracting segments;
they are called "of edge type". Vertex-type preMp's turn out to be a source
for the description of all preMp's; this will be discussed at the end of this
Part.

So, from now on until near the end of this Part, we will consider only
vertex-type preMp's without any special mention.

Let $CAC^{-1} = A$ (which means that $BAB^{-1} = A$, i.e. $B$ commutes with
$A$, and $b$ is a fixpoint of $A$). In this case we consider a preMp
$\mathcal{P}$ and the preMp $C\mathcal{P}$ as equivalent. Question: what is
the number of equivalence classes of the simplest preMp's for $A$? The answer
is given by the following theorem.
Figure 3: "island" (a) and "parquet" (b) types of preMp's.



Theorem 1. In terms of (12) (see Part 2), there are

$2(a_{k+1} + \ldots + a_{k+q}) = 2\,$(sum of the $a_j$ in the period)

classes of (vertex) preMp's; $2q$ (twice the length of the period) of them are
of the "island" type, the others are of the "parquet" type.

The two types mentioned in the theorem differ by the topological properties of
their lifting to the plane. For the "island" type there are parallelograms
which are bigger "in all directions" (let them be $\Pi_1 + (m, n)$) and they
constitute a connected set (the "ocean" $\bigcup_{m,n}(\Pi_1 + (m, n))$); the
union of the other parallelograms $\bigcup_{m,n}(\Pi_2 + (m, n))$ is
disconnected and its connected components are these $\Pi_2 + (m, n)$
("islands"). (See Figure 3a.)

For a "parquet" type preMp both sets $\bigcup_{m,n}(\Pi_1 + (m, n))$ and
$\bigcup_{m,n}(\Pi_2 + (m, n))$ have infinitely many connected components,
each consisting of infinitely many parallelograms; each component resembles a
stripe. (See Figure 3b.)

In the textbooks one can meet only the island type preMp. This is because 



golden mean). Its continued fraction expansion is $[(1)] = [1; 1, 1, \ldots]$.
So there are 2 simplest preMp's of the island type and no simplest preMp's of
the parquet type.
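The count in Theorem 1 is simple enough to state as a function. This is a minimal sketch, assuming the input is the minimal period of the continued fraction given as a list of positive integers; the function name `count_classes` and the second sample period are ours.

```python
def count_classes(period):
    """Numbers of equivalence classes of vertex preMp's according to Theorem 1.

    `period` is assumed to be the minimal period (a_{k+1}, ..., a_{k+q}).
    """
    total = 2 * sum(period)     # twice the sum of the a_j in the period
    island = 2 * len(period)    # twice the length of the period
    return {"total": total, "island": island, "parquet": total - island}

golden_mean = count_classes([1])     # the period (1) of the golden mean
other = count_classes([2, 1])        # a hypothetical period (2, 1)
```

For the period (1) this gives 2 classes, all of island type, in agreement with the example above; parquet classes appear as soon as some $a_j$ in the period exceeds 1.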

As far as we know, the first picture with a preMp of the parquet type was
published by E. Rykken. But, as far as we understand, she did not discuss when
such preMp's can appear.

Now we give an outline of the proof of this result.

First, we can consider only partitions for which the fixpoint from condition
III is the origin $O$. (Indeed, the shift of the torus by the vector from any
fixpoint to another one commutes with the transformation.)

Further, in a small neighborhood of $O$ the boundary forms two segments: one
passes through $O$, the other has its end there. So we have four broad classes
of

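Footnote: Note that $\begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}^2$, so each of the equivalence classes with respect to the centralizer is split
into two equivalence classes with respect to the group $\{\pm A^n \mid n \in \mathbb{Z}\}$.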




In this case k.a 
preMp's distinguished by the direction of the latter segment ($e_u$, $e_s$,
$-e_u$, $-e_s$). But all preMp's from the last two classes are equivalent to
preMp's from the first two of them by the automorphism $-\mathrm{id}$.

So, let us consider one of the first two classes, say the $e_u$-class. We are
going to prove that there are $S$ equivalence classes, $L$ of which are of
"island" type, in this broad class.

Lemma 1. All preMp's from the broad class form a doubly infinite sequence

$\ldots, \mathcal{P}_{-1}, \mathcal{P}_0, \mathcal{P}_1, \mathcal{P}_2, \ldots$ (16)

such that for their stable and unstable boundary segments $I^{s,u}$ the
following statement holds:

$I^s_k \supset I^s_{k+1}$, $I^u_k \subset I^u_{k+1}$.

Proof. Let $x(t)$ be the solution of $\dot{x} = e_u$ with $x(0) = O$ (so
$x(t)$ is a point moving along $W^u(O)$ with a constant velocity). Denote by
$(t_n)$ the sequence of all instants of time $t > 0$ when $x(t) \in I$. Here
$I \subset W^s(O)$ is a starting segment in the T-construction. In these terms
we can easily describe the T-construction applied to any $J \subset I$.
Indeed, the points $A_J$ and $B_J$ can be described as the points
$x(t_{n_A}) \in J$ and $x(t_{n_B}) \in J$, $n_A < n_B$, with the following
properties:

There are no $n < n_B$ such that $x(t_n)$ lies on $I$ between $x(t_{n_A})$ and $x(t_{n_B})$. (17a)

There are no $m < n_B$ such that $x(t_m) \in J$ and $x(t_{n_B})$ lies on $I$ between $x(t_m)$ and $O$. (17b)

So, if $\mathcal{P}_{(1)}$ and $\mathcal{P}_{(2)}$ are two preMp's and
$I = I^s_{(1)} \cup I^s_{(2)}$, we can apply this to $J = I^s_{(1)}$ and
$J = I^s_{(2)}$. Without loss of generality $n_{B_1} < n_{B_2}$ (hence
$I^u_{(1)} \subset I^u_{(2)}$). So $x(t_{n_{A_1}})$ and $x(t_{n_{B_1}})$ can't
lie between $A_2$ and $B_2$, whence $I^s_{(1)} \supset I^s_{(2)}$. Thus the
order

$\mathcal{P}_{(1)} \prec \mathcal{P}_{(2)} \iff I^u_{(1)} \subset I^u_{(2)}$

is linear. Moreover, each preMp $\mathcal{P}_{(1)} \succ \mathcal{P}$
corresponds to some number $n_B$ from conditions (17). So any "right tail"
$(\{\mathcal{P}' \mid \mathcal{P}' \succ \mathcal{P}_0\}, \succ)$ is
isomorphic as an ordered set to $(\mathbb{N}, >)$. Then the entire set of
preMp's is isomorphic either to $(\mathbb{N}, >)$ or to $(\mathbb{Z}, >)$. The
former case is eliminated due to the absence of an initial element in the
order: for a sufficiently long (in both directions) initial segment $I$ the
segment $AB$ corresponding to it is arbitrarily long (due to the density of
$W^u(O)$). □

Lemma 2. $A$ (or $-A$ if $\lambda_u < 0$) acts on the sequence (16) as a
shift: $A(\mathcal{P}_k) = \mathcal{P}_{k+S}$.

Proof. $A$ preserves the order $\prec$. Shifts are the only automorphisms of
the ordered set $(\mathbb{Z}, >)$. □

For what follows we need to consider the structure of the centralizer of $A$,
i.e. the group $C(A) = \{B \in GL_2(\mathbb{Z}) \mid AB = BA\}$.
Figure 4: Four consecutive preMp's for y
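Lemma 3. Suppose that $A \in GL_2(\mathbb{Z})$ is a hyperbolic matrix. Then
there exists $B \in GL_2(\mathbb{Z})$ such that
$C(A) = \{\pm B^n \mid n \in \mathbb{Z}\}$.

Proof. There exists a matrix $D \in GL_2(\mathbb{R})$ such that
$\Lambda = D^{-1}AD = \begin{pmatrix} \lambda_u & 0 \\ 0 & \lambda_s \end{pmatrix}$.
Then $X = D^{-1}(C(A))D$ is a subset of the centralizer of $\Lambda$ in
$GL_2(\mathbb{R})$, which is equal to
$\{\begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix} \mid \lambda, \mu \in \mathbb{R}^*\}$.
Since the conjugacy $M \mapsto D^{-1}MD$ is a homeomorphism of
$GL_2(\mathbb{R})$, $X$ is a discrete set. Moreover, as $\det M = \pm 1$ for
$M \in GL_2(\mathbb{Z})$, this set is a subset of
$Y = \{\begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix} \mid \lambda\mu = \pm 1\}$.
On $Y$ the projection
$\pi: \begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix} \mapsto \lambda$
is a 2:1-map, so $\pi(X)$ is a discrete subgroup of $\mathbb{R}^*$.

So we have two possibilities: $\pi(X) = \{\alpha^n \mid n \in \mathbb{Z}\}$ or
$\pi(X) = \{\pm\alpha^n \mid n \in \mathbb{Z}\}$. The former can't take place
because $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \in X$. Lifting of the
latter to $Y$ yields either
$X = \{\pm\begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}^n \mid n \in \mathbb{Z}\}$
or
$X = \{\begin{pmatrix} \pm\alpha^n & 0 \\ 0 & \pm\beta^n \end{pmatrix} \mid n \in \mathbb{Z}\}$
(signs are independent), where $\alpha\beta = \pm 1$.

Suppose the latter case takes place. Then
$F = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \in X$. Therefore
$DFD^{-1} \in C(A) \subset GL_2(\mathbb{Z})$. But $DFD^{-1}$ has $e_u$ as an
eigenvector with eigenvalue equal to 1. This means that the ratio of its
coordinates should be rational, so we have a contradiction.

Thus
$X = \{\pm\begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}^n \mid n \in \mathbb{Z}\}$,
and the statement of the lemma is true for
$B = D\begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}D^{-1}$. □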
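Lemma 3 can be observed experimentally on a small example. The sketch below is a brute-force search, not a proof: it lists all integer matrices with determinant $\pm 1$ and entries bounded by 5 that commute with a sample matrix $A$, and compares them with signed powers of a candidate generator $B$. The matrices $A$ and $B$ are our own choice of example (an assumption), as are all function names.

```python
from itertools import product

IDENTITY = ((1, 0), (0, 1))

def mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def neg(M):
    return tuple(tuple(-x for x in row) for row in M)

def commuting_unimodular(A, bound):
    """All integer matrices with |entries| <= bound, det = +-1, commuting with A."""
    found = set()
    for a, b, c, d in product(range(-bound, bound + 1), repeat=4):
        if abs(a * d - b * c) != 1:
            continue
        M = ((a, b), (c, d))
        if mul(A, M) == mul(M, A):
            found.add(M)
    return found

def signed_powers(B, B_inv, n_max):
    """The set of matrices +-B^n for |n| <= n_max."""
    powers, P, Q = set(), IDENTITY, IDENTITY
    for _ in range(n_max + 1):
        powers.update((P, neg(P), Q, neg(Q)))
        P, Q = mul(P, B), mul(Q, B_inv)
    return powers

A = ((2, 1), (1, 1))          # sample hyperbolic matrix (assumption)
B = ((1, 1), (1, 0))          # candidate generator, B * B == A
B_INV = ((0, 1), (1, -1))
centralizer_part = commuting_unimodular(A, 5)
```

Within the search box every commuting matrix turns out to be of the form $\pm B^n$, which is what the lemma predicts for this example.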

The matrix $B$ from the statement of the previous lemma can be easily
described in terms of continued fractions.

Lemma 4. Let $e_u = (\omega, 1)$,
$\omega = [b_0, \ldots, b_{n-1}, (a_1, \ldots, a_L)]$. Then $B$ from Lemma 3
can be chosen equal to $CDC^{-1}$, where

$C = C_1^{b_0} C_2 C_1^{b_1} C_2 \cdots C_1^{b_{n-1}} C_2$, $D = C_1^{a_1} C_2 C_1^{a_2} C_2 \cdots C_1^{a_L} C_2$.

(Matrices $C_{1,2}$, which correspond to the elementary operations
$T_1(\omega) = \omega + 1$ and $T_2(\omega) = 1/\omega$, were defined in
Part 2.)

Proof. Denote $CDC^{-1}$ by $B'$. We can see that $e_u$ is an eigenvector of
$B'$, so $e_s$ is also an eigenvector (since they are algebraically conjugate,
as well as their eigenvalues), so $B'$ commutes with $A$.

Each matrix in $C(A)$ acts on the continued fraction of $\omega$ as a shift,
and the map $d: C(A) \to L\mathbb{Z}$ that maps a matrix to the magnitude of
the corresponding shift is a group homomorphism. As $B'$ maps to $L$, $d$ is
an epimorphism. Thus $B$ maps to $L$ or to $-L$. Then
$d^{-1}(L) = \{\pm B\}$ in the former case and $d^{-1}(L) = \{\pm B^{-1}\}$ in
the latter one. In all cases $B' = \pm B^{\pm 1}$, so
$C(A) = \{\pm B'^n \mid n \in \mathbb{Z}\}$. □
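The product formula of Lemma 4 can be evaluated mechanically. In the sketch below we assume that $C_1$ and $C_2$ are the standard matrices of $\omega \mapsto \omega + 1$ and $\omega \mapsto 1/\omega$ (Part 2 is not reproduced here, so this identification is an assumption), and we take $\omega = [0, (2, 1)]$ as a sample; the function names are ours.

```python
import math

IDENTITY = ((1, 0), (0, 1))
C1 = ((1, 1), (0, 1))   # assumed matrix of T_1: omega -> omega + 1
C2 = ((0, 1), (1, 0))   # assumed matrix of T_2: omega -> 1 / omega

def mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def inv(M):
    """Inverse of an integer matrix with determinant +1 or -1."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return ((M[1][1] // det, -M[0][1] // det), (-M[1][0] // det, M[0][0] // det))

def word(exponents):
    """The product C1^{e_1} C2 C1^{e_2} C2 ... C1^{e_m} C2."""
    M = IDENTITY
    for e in exponents:
        for _ in range(e):
            M = mul(M, C1)
        M = mul(M, C2)
    return M

def generator(preperiod, period):
    """B' = C D C^{-1} as in Lemma 4."""
    C, D = word(preperiod), word(period)
    return mul(mul(C, D), inv(C))

B_prime = generator([0], [2, 1])        # omega = [0, (2, 1)]
omega = (math.sqrt(3) - 1) / 2          # value of [0, (2, 1)]
image = (B_prime[0][0] * omega + B_prime[0][1],
         B_prime[1][0] * omega + B_prime[1][1])
```

Here `image` is $B'(\omega, 1)$; its two coordinates have ratio $\omega$ again, so $(\omega, 1)$ is an eigenvector, as the proof of the lemma asserts.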

Now we pass to the central point of the proof: the interrelation between the
continued fraction of $\omega$ and preMp's.

Lemma 5. 1. Let the starting segment $I$ of the T-construction be sufficiently
short. Then all preMp's with $I^s \subset I$ can be described as follows. If
$A'$, $B'$ are the lifts of $A$ and $B$ that belong to $W^u(0, 0)$, then $A'$
(correspondingly, $B'$) lies on the lift of $I \subset W^s(O)$ that contains
$(p_k, q_k)$ (corr., $(lp_k + p_{k-1}, lq_k + q_{k-1})$), where
$1 \le l \le b_{k+1}$ and $p_n/q_n = [b_0, \ldots, b_n]$ is the $n$-th
convergent for $\omega$. Conversely, each such pair of points for sufficiently
large $k$ corresponds to some preMp.

2. The $k$'s and $l$'s for preMp's are arranged in (16) as follows:

$\ldots, (k-1, b_k), (k, 1), (k, 2), \ldots, (k, b_{k+1}), (k+1, 1), \ldots, (k+1, b_{k+2}), \ldots$

3. A preMp is of "island" type iff it corresponds to $(k, l)$ with
$l = b_{k+1}$.

4. $B'$ acts on the sequence (16) as a shift by $S = a_1 + \cdots + a_L$
positions.

Figure 5: The "butterfly" (a) and the transformation of an $(e_u, I)$-qMp (b)
into an $(e_u, J)$-qMp (c).

Footnote to Lemma 4: We also define $b_k$ for $k \ge n$ as follows:
$\omega = [b_0, \ldots, b_{n-1}, b_n, b_{n+1}, \ldots]$.

Figure 4 illustrates this lemma. It shows four consecutive members of sequence
(16) for $A = \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix}$ (here
$\omega = [0, (2, 1)]$). One can see that Fig. 4d presents the image of the
preMp from Fig. 4a under $A$ (the bold parallelogram is the image of the unit
square). Thus $A$ shifts sequence (16) by three positions, and "islands" and
"parquets" form the sequence
$(P, I, I) = \ldots, P, I, I, P, I, I, P, I, I, \ldots$, as follows from the
statements of the lemma.
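Statements 2 and 3 of Lemma 5 translate into a short enumeration. The following sketch, with names of our own choosing, lists the labels $(k, l)$ in the order of sequence (16) and marks each as island ("I", when $l = b_{k+1}$) or parquet ("P"); the continued fraction is passed as a pre-period and a period, which is an assumed input format.

```python
from itertools import islice

def partial_quotients(preperiod, period):
    """Yield b_0, b_1, b_2, ... for [preperiod, (period)]."""
    yield from preperiod
    while True:
        yield from period

def labelled_sequence(preperiod, period, k_start, count):
    """First `count` labels (k, l, type) of sequence (16), starting at k = k_start."""
    b = list(islice(partial_quotients(preperiod, period), k_start + count + 2))
    labels, k = [], k_start
    while len(labels) < count:
        for l in range(1, b[k + 1] + 1):          # 1 <= l <= b_{k+1}
            labels.append((k, l, "I" if l == b[k + 1] else "P"))
        k += 1
    return labels[:count]

labels = labelled_sequence([0], [2, 1], 0, 6)      # omega = [0, (2, 1)]
types = "".join(t for _, _, t in labels)
```

For $\omega = [0, (2, 1)]$ the types repeat with period $S = 2 + 1 = 3$ in the pattern P, I, I, matching the description of Figure 4.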

Note also that the last two statements imply that there are exactly $S$
equivalence classes (we recall that now only $e_u$-type preMp's are
considered), $L$ of which consist of "island"-type preMp's. Their link to
$e_s$-type preMp's, provided by Lemma 6 below, will finish the proof.

Proof. Let $I$ be so small that different "butterflies" on the plane don't
intersect. Here a "butterfly" is defined as the union of two triangles (with
their interiors), the boundary of each of which consists of a connected
component of $I \setminus \{O\}$, a horizontal segment passing through $O$ and
a segment parallel to $e_u$ (see Figure 5a).

Thus there is a 1:1-correspondence between qMp's generated by the
T-construction for $I$ and those generated by the T-construction for the
horizontal segment of the "butterfly". (It is denoted by $J$.) This
correspondence is shown in Figures 5b-c. It is well-defined since all
transformations are inside the "butterfly", which is injectively mapped into
the plane. Note also that the relation between $OA$ and $OB$ is the same as
the one between $OA'$ and $OB'$; this will be useful for finding the type of
the partition.

By the same reasoning as in the proof of Lemma 1, one can obtain that the
points $A$ and $B$ for any preMp with $I^s \subset I$ are
$x(t_{n_{A,B}})$ that satisfy condition (17a). As $e_u = (\omega, 1)$ and $J$
belongs to the $x$-axis, all $t_n$ are integers. So this condition can be
reformulated as follows: the $x$-coordinates of $A$ and $B$ are equal to
$q_{A,B}\,\omega - p_{A,B}$ (with $q_A < q_B$) such that

$0 \in [q_A\omega - p_A, q_B\omega - p_B]$; (18a)

there are no $(p', q')$ with $q' < q_B$ such that $q'\omega - p' \in [q_A\omega - p_A, q_B\omega - p_B]$. (18b)

Consequently, both $(p_A, q_A)$ and $(p_B, q_B)$ satisfy the following
condition:

there are no $(p', q')$ such that $0 < q' < q$ and $q'\omega - p' \in [0, q\omega - p]$. (19)

Such pairs $(p, q)$ (or, more commonly, fractions $p/q$) are called one-sided
best approximations to $\omega$ of the second type. Similarly, pairs $(p, q)$
satisfying the condition

there are no $(p', q')$ such that $0 < q' < q$ and $|q'\omega - p'| < |q\omega - p|$, (20)

are called (two-sided) best approximations to $\omega$ of the second type.
We state a theorem from number theory describing them.

Theorem 2. 1. If $\omega = [b_0, b_1, \ldots]$ then the one-sided
approximations are $p/q = [b_0, \ldots, b_{k-1}, l]$, where
$1 \le l \le b_k$. They are arranged as

$\underbrace{[1], [2], \ldots, [b_0]}_{\text{from below}},\ \underbrace{[b_0, 1], \ldots, [b_0, b_1]}_{\text{from above}},\ \underbrace{[b_0, b_1, 1], \ldots, [b_0, b_1, b_2]}_{\text{from below}}, \ldots$ (21)

with denominators growing in the sequence.

2. Two-sided approximations are only the following ones:

$[b_0], [b_0, b_1], [b_0, b_1, b_2], \ldots, [b_0, b_1, \ldots, b_n], \ldots$ (22)

This theorem seems to be well-known and can be proved in a way similar to the
classical theorem on two-sided approximations (see, e.g., [Kh]).
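Statement 1 of Theorem 2 can be compared with a direct search based on condition (19). The sketch below is an experiment, not a proof; it uses $\omega = \sqrt{2} = [1, (2)]$ as a sample value (our choice) and floating-point arithmetic, which is adequate only for the small denominators considered. For $q \ge 2$ the only candidates are $p = \lfloor q\omega \rfloor$ (from below) and $p = \lceil q\omega \rceil$ (from above), so the search keeps track of the record one-sided errors.

```python
import math

def semiconvergents(b, q_max):
    """Pairs (p, q) with p/q = [b_0, ..., b_{k-1}, l], 1 <= l <= b_k, q <= q_max."""
    result = set()
    p_prev, q_prev = 0, 1      # numerator and denominator two steps back
    p, q = 1, 0                # numerator and denominator one step back
    for bk in b:
        for l in range(1, bk + 1):
            pl, ql = l * p + p_prev, l * q + q_prev
            if ql <= q_max:
                result.add((pl, ql))
        p_prev, q_prev, p, q = p, q, bk * p + p_prev, bk * q + q_prev
    return result

def one_sided_best(omega, q_max):
    """Pairs (p, q), q <= q_max, that set a new record one-sided error."""
    best_below = best_above = float("inf")
    found = set()
    for q in range(1, q_max + 1):
        below = q * omega - math.floor(q * omega)   # error of p = floor(q * omega)
        above = math.ceil(q * omega) - q * omega    # error of p = ceil(q * omega)
        if below < best_below:
            found.add((math.floor(q * omega), q))
            best_below = below
        if above < best_above:
            found.add((math.ceil(q * omega), q))
            best_above = above
    return found

Q_MAX = 30
from_theorem = {pq for pq in semiconvergents([1] + [2] * 10, Q_MAX) if pq[1] >= 2}
from_search = {pq for pq in one_sided_best(math.sqrt(2), Q_MAX) if pq[1] >= 2}
```

Pairs with $q = 1$ are left out of the comparison, since for them condition (19) is vacuous.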

Thus, $p_A/q_A$ and $p_B/q_B$ are fractions from (21). However, condition (18)
is stronger. Obviously it can be expressed as follows: there are no
approximations from the same side as $p_A/q_A$ between $p_A/q_A$ and
$p_B/q_B$ in sequence (21).

Consequently,

$p_A/q_A = [b_0, b_1, \ldots, b_k]$, $p_B/q_B = [b_0, \ldots, b_k, l]$, (23)

where $1 \le l \le b_{k+1}$. This proves the first two statements of the
lemma. (Actually it remains to prove that

$[b_0, \ldots, b_k, l] = \dfrac{p_k l + p_{k-1}}{q_k l + q_{k-1}}$. (24)

This can be done by induction over $k$.)

The third statement is also simple. A parallelogram with its base on the
segment $OA$ has height $q_B$ and the one with its base on $OB$ has height
$q_A$. Thus if $OA' > OB'$ this preMp is of "island" type and otherwise it is
of "parquet" type. (Recall that when we return back to the segment $AB$ the
type of the partition remains the same.) Statement 2 of Theorem 2 implies that
the former case takes place only if $l = b_{k+1}$ in (23).

The fourth statement of the lemma obviously follows from the fact that $B'$
maps $(p_k, q_k)$ to $(p_{k+L}, q_{k+L})$. □
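Identity (24), used in the proof above, can be spot-checked in exact arithmetic before attempting the induction. The sketch below uses Python's `fractions` module; the function names and the sample partial quotients are our own assumptions, and the convention $p_{-1} = 1$, $q_{-1} = 0$ is the standard one for convergents.

```python
from fractions import Fraction

def cf_value(terms):
    """Exact value of the finite continued fraction [t_0, t_1, ..., t_m]."""
    value = Fraction(terms[-1])
    for t in reversed(terms[:-1]):
        value = t + 1 / value
    return value

def convergents(terms):
    """List of (p_n, q_n) with p_n / q_n = [t_0, ..., t_n]."""
    p_prev, q_prev, p, q = 1, 0, terms[0], 1
    out = [(p, q)]
    for t in terms[1:]:
        p_prev, q_prev, p, q = p, q, t * p + p_prev, t * q + q_prev
        out.append((p, q))
    return out

def check_identity(b, k, l):
    """Check [b_0, ..., b_k, l] == (p_k l + p_{k-1}) / (q_k l + q_{k-1})."""
    conv = convergents(b[: k + 1])
    p_k, q_k = conv[k]
    p_km1, q_km1 = conv[k - 1] if k >= 1 else (1, 0)
    return cf_value(b[: k + 1] + [l]) == Fraction(p_k * l + p_km1, q_k * l + q_km1)

SAMPLE = [0, 2, 1, 2, 1]    # first partial quotients of [0, (2, 1)]
all_hold = all(check_identity(SAMPLE, k, l) for k in range(5) for l in range(1, 4))
```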

To finish the proof of Theorem 1 it remains to prove that the number of
$e_u$-type preMp classes is equal to the number of those of $e_s$-type. It
trivially follows from the next (and final) lemma.

Lemma 6. PreMp's of $e_u$-type and of $e_s$-type can be put into a bijective
correspondence in such a way that any preMp can be mapped to its correspondent
by a shift on the torus.

Proof. Each preMp has 4 joint points on its boundary (one of each type). So we
should just shift it to place the required joint point at the origin. The
result will be a preMp, so we define two mutually inverse maps (one from
$e_u$-preMp's to $e_s$-preMp's, the other in the reverse direction). So there
is a 1:1-correspondence. □

Now we will briefly discuss preMp's with two different fixpoints on the
boundary. They really appear, at least for some automorphisms. For example,
let us consider a standard automorphism $A$ and a large power of it,
$B = A^n$. Then $B$ has quite many fixpoints, which are quite densely placed
on the torus. Now take any preMp (for $A$) of $e_u$-type and shift it by the
vector $-\varepsilon e_u$. Since the fixpoints are quite densely placed on the
torus, for a rather small $\varepsilon$ the stable segment of the shifted
preMp will pass through a fixpoint. On the other hand, as this shift is quite
small, the origin will remain on the unstable segment.
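How quickly the fixpoints of a power accumulate can be seen on a small example. The sketch below counts the fixpoints of $A^n$ on the torus in two ways: as $|\det(A^n - \mathrm{id})|$, and by brute force over the grid of points with denominator $N = |\det(A^n - \mathrm{id})|$, which contains all fixpoints. The matrix $A$ used here is our own sample choice (an assumption), as are the function names.

```python
IDENTITY = ((1, 0), (0, 1))

def mul(X, Y):
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def power(A, n):
    M = IDENTITY
    for _ in range(n):
        M = mul(M, A)
    return M

def fixpoint_count(A, n):
    """Return (brute-force count, |det(A^n - id)|) for the fixpoints of A^n."""
    P = power(A, n)
    M = ((P[0][0] - 1, P[0][1]), (P[1][0], P[1][1] - 1))
    N = abs(M[0][0] * M[1][1] - M[0][1] * M[1][0])
    count = 0
    for i in range(N):
        for j in range(N):
            # (i/N, j/N) is fixed by A^n iff M (i, j) is divisible by N
            if (M[0][0] * i + M[0][1] * j) % N == 0 and \
               (M[1][0] * i + M[1][1] * j) % N == 0:
                count += 1
    return count, N

A = ((2, 1), (1, 1))    # sample hyperbolic matrix (assumption)
counts = [fixpoint_count(A, n) for n in range(1, 5)]
```

The counts grow exponentially with $n$, which is why a small shift suffices to bring a boundary segment onto a fixpoint of a large power.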

Similarly to Lemma 6 it can be proved that any preMp (with an arbitrary
position of its fixpoints) can be obtained from, say, some $e_u$-type preMp by
some shift. The number of preMp's obtained from one can be found
algorithmically as the number of points of a lattice in a parallelogram.
Indeed, if $P$ is a joint point of the $e_u$-type, and $U = P + xe_u$ and
$S = P + ye_s$ are fixpoints, then $xe_u - ye_s$ belongs to the lattice of all
fixpoints. Thus we have a parallelogram of points of the form $xe_u - ye_s$
(as $x$ and $y$ are restricted to some segments), and each point of the
fixpoint lattice in it corresponds to a preMp.